ABSTRACT


Punjabi is a modern Indo-Aryan (Indic) language spoken primarily in the Punjab
states of both India and Pakistan. Punjabi is written either in Gurmukhi, a
Brahmi-derived script written from left to right, or in Shahmukhi, a Perso-Arabic
script written from right to left. The phonology discussed herein relates to the
Gurmukhi script, as it is used by speakers in Punjab and by the majority of speakers
across the globe. The characters in the script are normally aligned below the line of
writing. Punjabi has concatenative morphology, i.e. many words can be created by
taking a root word and adding various morphemes. Punjabi and Dogri are the
prominent tonal languages in this family. Lexical tone in Punjabi is used to
distinguish words. The vowel is the tone-bearing unit in Punjabi. A word has a single
tone, which may co-occur with stress on the syllable.


The lexicon is the bridge between a language and the knowledge expressed in that
language. The lexicon in any system plays an important, dynamic and necessary part
in the syntactic and semantic fields. Monolingual (Punjabi) and bilingual (English &
Punjabi) dictionaries are available in print as well as in electronic form, containing
information such as meaning and pronunciation. A pronunciation lexicon for machine
learning is required for the development of speech systems. The W3C (World Wide Web
Consortium) has defined the Pronunciation Lexicon Specification (PLS) version 1.0
(2008), which covers multiple pronunciations and multiple orthographies in an XML
structure at the lexicon level, thus providing the flexibility of creating
language-specific PLS documents. The current version, PLS 1.0, is a broad baseline
specification which covers the requirements of Latin-script-based languages and can
be used for all global languages. The specification document also cites a few
examples for Japanese and Chinese. The requirements of many other global languages,
such as Indian languages, have not been discussed in that document.
In Indian languages, grammatical information is encoded more in morphology than in
syntax, unlike English, where grammatical information is an integral
part of the syntax. Hence there is a need to augment the PLS structure to broadly
cater to Indo-Aryan languages.

The main objective of the thesis is to evolve a Pronunciation Lexicon Specification
for the Punjabi language within the W3C framework. This framework is expected to
capture the phonological features of the Punjabi language and provide phonetic
evidence for them through the experimental study of recorded speech signals of the
Malwai dialect of the language. The phonetic experiments involve collection of data,
recording, data segregation, annotation and analysis. The data have been sourced
from published Punjabi dictionaries. Phonetically rich and frequently occurring
words of Punjabi were collected for phonological analysis, covering all phonemes,
tonemes, the consonant /h/, conjuncts of /h/ and the schwa vowel of Punjabi.
Monosyllabic, disyllabic, trisyllabic and polysyllabic words containing various
combinations of light, heavy and super-heavy syllables were selected for the study of
lexical stress. Ten informants (4 male and 6 female) in the 25-40 age group, from
rural, town and city backgrounds, were identified. A total of 4000 words were
recorded for 10 speakers, and 50 sentences across 8 speakers for the prosodic
study. Data were recorded in the laboratory with good-quality audio recording
devices in a standard speech and noise-free environment with SNR >= 45 dB, as per
the standardized procedure for speech corpora development based on the ITU
recommendations. Speech tools such as Praat and GoldWave have been used for
speech analysis. Other parameters such as formant frequencies, duration and intensity
of syllables in a word were also recorded for prosody analysis.


The discussion on tone has been covered under two categories of tone, i.e.
Independent Tone and Tone arising from Supra-laryngeal Consonants. For this, words
with each of the five tonemes in the initial, medial and final syllable of the word
have been compiled, ensuring phonetic coverage in terms of various vowels,
diphthongs, nasalization, gemination and other co-articulation parameters, such as
the occurrence of the toneme as onset/coda in the above contexts across monosyllabic,
disyllabic, trisyllabic and polysyllabic words.


For Independent tones, words containing the consonant /h/ in the initial, medial and
final syllable of the word were considered to examine these characteristics. Half /h/
does not occur in the initial syllable, hence words containing conjuncts of /h/ in the
medial and final syllable were considered. Spectrographic analysis of all the
male and female samples was carried out using Praat. The duration, fundamental
frequency (F0) and quarter-wise slope of the vowel associated with the Tone Bearing
Unit (TBU) have been recorded. The data have been tabulated for three categories of
words, i.e. tonemes, consonant /h/ and conjuncts of /h/, capturing the variety of
acoustic environments. The objective of the experimental work was to corroborate the
tone rules of Punjabi as collated through the literature survey. These rules have been
experimentally verified and are applicable by and large. The detailed analysis
presented in the thesis is based on experimental observations, which led to the
discovery of allotones and to findings on tone variations due to various
co-articulatory factors.
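
The quarter-wise slope measurement mentioned above can be illustrated with a minimal Python sketch. It assumes the F0 contour over the tone-bearing vowel is available as time-stamped samples (for example, exported from a pitch tracker), splits the vowel into four equal time quarters, and reports the F0 change per second in each quarter. The function names and the sample contour are hypothetical and are not taken from the thesis data.

```python
def interpolate(times, values, t):
    """Linearly interpolate the contour at time t (times strictly increasing)."""
    pairs = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(pairs, pairs[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the sampled interval")


def quarter_slopes(times, f0):
    """Return the F0 slope (Hz per second) over each quarter of the vowel."""
    start, end = times[0], times[-1]
    step = (end - start) / 4.0
    # Quarter boundaries; the last edge is pinned to the final sample time.
    edges = [start + i * step for i in range(4)] + [end]
    points = [interpolate(times, f0, t) for t in edges]
    return [(b - a) / step for a, b in zip(points, points[1:])]


# Hypothetical rising-falling contour over a 0.4 s vowel (not thesis data).
times = [0.0, 0.1, 0.2, 0.3, 0.4]
f0 = [100.0, 110.0, 120.0, 110.0, 100.0]
print(quarter_slopes(times, f0))
```

A positive value indicates rising pitch in that quarter and a negative value falling pitch, so the sequence of four signs gives a coarse description of the contour shape.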


As in other Indic languages, stress is not a prominent feature of Punjabi; however,
it is used in disyllabic words to distinguish between grammatical categories.
Text-to-speech technology that uses a concatenative approach produces
artificial-sounding speech that lacks prosody. Research on intra-syllabic stress at
the lexical level can help bridge this gap. An empirical study has been used for this
purpose. To start with, non-tonal words were taken as the basis for determining the
inter-relationship between syllables in a word and reporting the heaviest syllable,
which is the carrier of stress and can be used by a TTS system for natural production
of speech. The acoustic parameter of syllabic weight has been modeled using linear
regression for relational analysis. The duration, pitch and intensity of both
syllables in a word, averaged across 10 speakers for 95 words, were tabulated in a
spreadsheet and plotted using CurveExpert Professional, a cross-platform solution for
curve fitting and data analysis. The linear equations of the three acoustic
parameters, i.e. Intensity (I), Pitch (P) and Duration (D), which influence lexical
stress were obtained using a piece-wise curve fitting technique. The normal
distribution curves for all three acoustic parameters were plotted using the standard
deviation and mean of the data averaged over two syllables.


By analyzing the above functions derived from the stress patterns of the recorded
samples, the corresponding weightage factors of the acoustic parameters I, P and D
have been calculated. The different categories of data, i.e. di-, tri- and
polysyllabic non-tonal words and di-, tri- and polysyllabic tonal words, were analyzed
to find the probability of occurrence of a score by standardizing the scores into
z-scores. The z-score is then looked up in a Z table, which gives the
probability of the score. The lexical stress pattern for the given range of the data is
obtained using the 80-20 rule. The stress rules have been evolved based on a minimum
80% probability of occurrence of that rule in the data analyzed using the
heuristic approach defined above. Based on this experimental study, rules have
been proposed for marking lexical stress in PLS for different categories of words.
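
As a minimal illustration of the z-score step, the Python sketch below standardizes a set of scores and obtains the cumulative probability from the standard normal distribution, which is the value a printed Z table supplies. The sample durations, the function names and the choice of the population standard deviation are assumptions made for illustration only; they are not the thesis data or its exact procedure.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: z = (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption of this sketch
    return [(x - mu) / sigma for x in values]


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable (the Z-table value)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical syllable durations in milliseconds (not thesis data).
durations = [180.0, 200.0, 220.0, 240.0, 260.0]
for d, z in zip(durations, z_scores(durations)):
    print(f"{d:6.1f} ms  z = {z:+.3f}  P(Z <= z) = {normal_cdf(z):.3f}")
```

Under these assumptions, a rule could be accepted when the probability mass covered by the observed pattern reaches the 80% threshold described above.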


Schwa is an important part of the vowel space but is considered a weak vowel
compared to other vowels. Schwa has been the subject of much research by
phonologists, yet substantially less consideration has been given to the study of
the phonetic attributes of the schwa vowel. In Punjabi, schwa is a short neutral
vowel which can sound like any of the other vowels; its exact quality changes
depending upon the adjacent consonants. Words containing schwa in different
positions, and the occurrence of these words in sentences, have been considered in
various phonetic contexts, i.e. word-initial schwa or inherent schwa in a consonant
cluster (CC), schwa as a tone-bearing unit, schwa with nasalization, schwa
associated with a geminated consonant as onset, and schwa as release vowel. Work
has also been carried out to observe the quality of the release vowel in a sentence
vis-à-vis an isolated word in the same acoustic context. The different parameters,
i.e. fundamental frequency (F0), formants F1 and F2, acoustic space in terms of F1
and F2, intensity, duration, slope of the F0 contour over the TBU and burst energy
(BE), have been used to study the variations in the quality of schwa in these
contexts. These variations have been discussed in terms of vowel height and vowel
frontness / backness. Part of speech (POS) is an available source for feature
extraction for building NLP and speech systems, as Punjabi is a highly inflectional
language. There is a need to develop a prosodic PLS based on morphological and
overriding phonological features such as stress, tone, gemination, nasalization etc.


The complete phonological coverage of PLS data needs to be ensured by
incorporating words that contain maximum inflection under each POS category. This
can be a useful resource for training of speech systems. The thesis discusses the
prosodic features of Punjabi with the help of examples along with IPA transcription
and POS information. The various POS inflections in Punjabi, such as prefixes and
suffixes with change in grammatical categories, have been presented. The distinctive
features of the morphology-phonology interface of the Punjabi language are discussed
in this thesis, such as tone, nasalization, gemination, word variants in homonyms,
homographs, homophones, borrowed words, abbreviations etc.


The Framework for Pronunciation Lexicon Specification for Punjabi Language
(PLS 2.0), with the addition of new element tags and attributes, has been proposed.
The thesis presents sample PLS data in XML, conforming to the PLS 2.0 framework,
for different POS categories such as Noun, Pronoun, Adjective, Adverb,
Demonstrative words, Verb, Postposition, Conjunction, Homographs and Multi-Word
Expressions such as echo words, duplicates etc.


The phonological research findings of the present study can be leveraged to
implement a computational phonology model for the Punjabi language. They can also be
used to build a large word-level speech corpus containing prosodic information,
syntax and semantics that can be used for the development of Punjabi Text-to-speech
(TTS) Systems, Language Identification Systems, and Speech Recognition Systems.
The foundational work done for Punjabi prosody in this thesis can provide a strong
basis for future research in areas such as co-articulation modelling of Punjabi,
prosodic-feature-based modelling techniques for language recognition and the
extension of the work to other Indo-Aryan languages.


Chapter 1 


Introduction 


1.1 Research Problem 


The Pronunciation Lexicon Specification (PLS 1.0) has been designed by the World
Wide Web Consortium (W3C) with the goal of providing inter-operable specifications
of pronunciation information which can be used for speech technology development.
This specification allows multiple pronunciations for the same orthography as well
as multiple orthographies for a single pronunciation entry in the PLS. Lexical
phonology assumes that all word formation, including inflection, is carried out in
the lexicon, as discussed by Clements & Keyser (1983). As the morphology and part of
the phonology are carried out in the lexicon, the nature of the syllable needs to be
defined in terms of the nature of its nucleus, as the nucleus is considered to be a
prosodic category. It also defines the type of onset and coda, and such
definitions are language dependent. These features are not discussed in the current
PLS 1.0 specification. Among Indo-Aryan languages, the tonal feature of Punjabi makes
it more complex. The major hurdle in creating a PLS for Punjabi is to capture the
pronunciation as properly understood by a native speaker. Thus new elements
need to be identified to make the PLS morphologically and phonologically richer.
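
To make the PLS 1.0 structure concrete, the following Python sketch builds a small lexicon document in which one lexeme may carry several grapheme and several phoneme children, which is how the specification expresses multiple orthographies and multiple pronunciations. The element names, the namespace and the `version`, `alphabet` and `xml:lang` attributes come from PLS 1.0; the helper name `build_lexicon`, the language tag `pa` and the sample entry are assumptions for illustration and are not the extended elements proposed in this thesis.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_lexicon(entries, lang="pa", alphabet="ipa"):
    """Build a PLS 1.0 <lexicon>; entries is a list of (graphemes, phonemes)."""
    ET.register_namespace("", PLS_NS)  # serialize PLS as the default namespace
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": alphabet, f"{{{XML_NS}}}lang": lang},
    )
    for graphemes, phonemes in entries:
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        for g in graphemes:  # one or more orthographies
            ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = g
        for p in phonemes:  # one or more pronunciations
            ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = p
    return lexicon


# Illustrative entry only: a Punjabi word with a broad IPA transcription.
lexicon = build_lexicon([(["ਸਿਰਦਰਦ"], ["sirdərd"])])
print(ET.tostring(lexicon, encoding="unicode"))
```

Plain PLS 1.0 has no dedicated slot for tone, stress or part-of-speech information on a lexeme, which is the gap the additional elements are meant to fill.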


The theory of Generative Phonology, Chomsky & Halle (1968), is concerned with the
generation of rules that apply to the phonemic level of representation to yield the
phonetic level of representation. It treats the phoneme as a bundle of features. The
generative phonological approach gives equal importance to theoretical concepts and
principles and to the facts of data analysis.

The primary concern of Generative Phonology is the development of the rules that
deal with the pronounceability of the strings 'generated' by the syntactic component
of the grammar.

The generative approach to phonological analysis begins by stating the syntactic
structure and passes this on to phonology, which can further use any relevant
syntactic facts. According to this theory, words are fully syllabified at the level of
lexical representation, which constitutes the input to the phonological component.
Thus the postulation of syllabic structure in the lexicon makes it possible to achieve
significant simplification of the phonological component.


The proposed research focus is to derive Punjabi phonological rules for applying 
these on lexicon element of PLS 1.0 and its XML codification. This organizing 
principle is expressed by placing all lexical phonological rules in the lexicon. 
Therefore PLS for Punjabi language within the W3C framework will be proposed 
which will be useful for the development of prosodic Punjabi TTS. 


1.2 Objectives of the Thesis 


Consistent specification of word pronunciation is critical to the success of many
speech technology applications. Several guidelines have been reported to define the
structure of a pronunciation lexicon, ranging from simple two-column ASCII
lexicons. This gap has been bridged by the W3C PLS 1.0 Specifications which have
been brought out as a broad specification for generation of pronunciation data in
XML format for machine learning. This specification needs to be examined for its
applicability for morphologically & phonetically richer Indian languages. The main
objectives of the proposed research are as follows:

i. Adaptation of the W3C PLS 1.0 for evolving a framework capturing Punjabi
language phonological features.

ii. Corroboration of the major linguistic aspects through analytical study of recorded
speech signals for the Punjabi language.

iii. Identification of the challenges in designing a web-based Machine-Readable
Pronunciation Lexicon Specification in XML.

iv. Design of new lexeme elements and attributes to incorporate the identified
features.


1.3 Lexical Phonology 


Phonology deals with the abstract mental representation of sound rather than the
properties of the physical speech signal, whereas morphology is concerned with the
principles that regulate word structure in a language and how that structure relates to
other components (e.g. syntax, phonology). The morphological structure of a
complex word determines how the constituent morphemes of a word are realized
phonetically. The phonological structure of a complex word reflects its
morphological structure, but is not isomorphic to that structure. Thus morphological
and phonological processes are tightly interrelated in speech production. During
processing, morphological processes must combine the phonological content of
individual morphemes to produce a phonological representation. Further, morpheme
assembly frequently causes changes in a word's phonological well-formedness that
must be addressed by phonology; hence morphology and phonology are closely
interrelated. The phonological structure of a language covers the inventories of
phonological units (in common terms, inventories of vowels, consonants, syllables
and tones), prosodic organization (in common terms, the organization of speech forms
from lower to higher levels, i.e. segments -> syllables -> words -> phonological
phrases -> intonational phrases) and the relation of phonology with syntactic,
semantic and pragmatic structures, Pandey (2007).


[Figure: flow diagram linking the following components. Mouth and ears; Phonology
(rules that define the sound pattern of a language); Lexicon (stored entries for
words, including irregulars); Morphology (rules for forming complex words, including
regulars); Syntax (rules for forming phrases and sentences); Semantics (meanings
expressed through language); Beliefs and desires.]

Fig 1/1: Morphology-Phonology Interface


According to the lexicalist hypothesis proposed by Chomsky (1970), all word
formation, including inflection, takes place in the lexicon. The theory of lexical
phonology seeks to explain the inter-relationships between morphology and
phonology by allocating some of the phonological processes to the dictionary or
lexicon in which the morphemes reside. The domains of both morphological and
phonological rules within the lexicon are subdivided into strata, which define both
the type of morphological processes applicable and the mode of operation of the
associated phonological rules, in the form of a structured framework.


Phonological studies of Indian languages have been carried out within varied
theoretical approaches. The framework of American structural linguistics has mostly
been used for this purpose. Such full-length phonological studies can be seen in
Kelkar (1968), which is based on a conception of language in which phonology
manifests syntactic, semantic and pragmatic aspects of the linguistic knowledge of a
sentence. Absolute phonological studies of Indian languages usually do not
exhaustively cover all aspects of phonological structure. These generally follow the
usual divide between segmental and suprasegmental phenomena. There is a general
lack of good-quality descriptions of the phonologies of Indian languages.

Tones in Indian languages have received only stray treatment, e.g. Haudricourt (1971)
& Burling (1992). The Punjabi language studies by Gill & Gleason (1969), Dulai &
Koul (1980), Gill & H.S (1986), Singh (2001), Singh (1991), Arun & Bhaskar
(1997), Bhatia (1993) etc. cover a majority of the linguistic rules of the Punjabi
language. Punjabi is highly tonal, Haudricourt (1971), and the tones arise as a
reinterpretation of different consonant series in terms of pitch. The phonological
studies on Punjabi need to be further investigated.


1.4 Orthography of Punjabi 


Punjabi is a modern Indo-Aryan language spoken primarily in the Punjab states of
both India and Pakistan. It is one of the Indic (Indo-Aryan) languages and is
distinguished from the other languages of this family (other than Dogri) in that it
has developed tonal contrasts.


[Figure: family tree. Indo-European > Indo-Iranian > Sanskrit (to 600 B.C) > Pali
(600 B.C-500 A.D) > Apabhramshas (500 A.D-1000 A.D), dividing into the North-Western
Apabhramshas and other regional Apabhramshas (Eastern, Western); New Indo-Aryan:
Classical Punjabi (10th century-16th century) > Medieval Punjabi (16th century-19th
century) > Modern Punjabi (19th century onwards); Lahanda.]

Fig 1/2: Family tree of Indo-Aryan Languages
A language usually refers to the spoken language, a method of communication. A
script refers to a set of characters used to write one or more languages. Brahmi script
is the oldest known writing system of ancient India, which evolved at the beginning of
the 4th century BCE. Brahmi inscriptions are found on the edicts of Ashoka in
north-central India. Indic scripts are descendants of the Brahmi script and are
abugidas. They use a system of diacritic marks to associate vowels with consonant
symbols. Indic scripts are typified by Devanagari and have two important
characteristics: conjuncts and an orthographic syllabic structure. Each Indic-based
orthography has a set of common conjuncts that are used, along with a possible
further set of rarely used conjuncts. Conjuncts / ligatures representing consonant
sequences for Gurmukhi are given in the following example:


ਮ + ੍ + ਹ = ਮ੍ਹ


Punjabi uses either the Gurmukhi script (written from left to right) or the Shahmukhi
script (written from right to left), a Perso-Arabic script. The phonology discussed
herein refers to the Gurmukhi script, as it is used by the majority of speakers across
the globe. According to the 2011 census of India, there are 27,704,236 Punjabi
speakers in India; globally there are 120 million people who speak Punjabi. Punjabi is
a tonal language. The characters are normally aligned below the line of writing.
Punjabi has concatenative morphology, i.e. many words can be created by taking a root
word and adding various morphemes. The Punjabi language leans very heavily on the use
of suffixes, whereas prefixes are used less. Word order is Subject Object Verb (SOV)
and is fairly fixed.


Gurmukhi has thirty-five (/pɛ̃ti/) letters as per the old orthography. Vowels other
than /ə/ are indicated by accessory symbols (vowel matras) written around the
consonant symbols. [ੳ] U+0A73 & [ੲ] U+0A72 are vowel bearers and are not used as
independent vowels.

The characters as per Unicode 11.0 are listed in the following table.


S. No. | Characters | Characters with Code-Point
1 | Vowels (referred to as primary vowels) | ਅ=U+0A05, ਆ=U+0A06, ਇ=U+0A07, ਈ=U+0A08, ਉ=U+0A09, ਊ=U+0A0A, ਏ=U+0A0F, ਐ=U+0A10, ਓ=U+0A13, ਔ=U+0A14
  | Vowel Matras (referred to as secondary symbols of the vowel) | ਾ=U+0A3E, ਿ=U+0A3F, ੀ=U+0A40, ੁ=U+0A41, ੂ=U+0A42, ੇ=U+0A47, ੈ=U+0A48, ੋ=U+0A4B, ੌ=U+0A4C
2 | Consonants | ਸ=U+0A38, ਸ਼=U+0A36, ਹ=U+0A39, ਕ=U+0A15, ਖ=U+0A16, ਖ਼=U+0A59, ਗ=U+0A17, ਗ਼=U+0A5A, ਙ=U+0A19, ਚ=U+0A1A, ਛ=U+0A1B, ਜ=U+0A1C, ਜ਼=U+0A5B, ਞ=U+0A1E, ਟ=U+0A1F, ਠ=U+0A20, ਡ=U+0A21, ਣ=U+0A23, ਤ=U+0A24, ਥ=U+0A25, ਦ=U+0A26, ਨ=U+0A28, ਪ=U+0A2A, ਫ=U+0A2B, ਫ਼=U+0A5E, ਬ=U+0A2C, ਮ=U+0A2E, ਯ=U+0A2F, ਰ=U+0A30, ਲ=U+0A32, ਲ਼=U+0A33, ਵ=U+0A35, ੜ=U+0A5C
  | Tonemes | ਘ=U+0A18, ਝ=U+0A1D, ਢ=U+0A22, ਧ=U+0A27, ਭ=U+0A2D

Table 1/1: Vowels & Consonants with Unicode Code-Points

S. No. | Characters | Characters with Code-Point
1 | Numerals | ੦=U+0A66, ੧=U+0A67, ੨=U+0A68, ੩=U+0A69, ੪=U+0A6A, ੫=U+0A6B, ੬=U+0A6C, ੭=U+0A6D, ੮=U+0A6E, ੯=U+0A6F
2 | Special Symbols | ਂ=U+0A02, ੰ=U+0A70, ਼=U+0A3C, ੴ=U+0A74, ੱ=U+0A71, ੍=U+0A4D, ◌=U+25CC, ਃ=U+0A03, ਁ=U+0A01
3 | Punctuation | ।=U+0964, ॥=U+0965
4 | Conjuncts | ਮ + ੍ + ਹ = ਮ੍ਹ = U+0A2E+0A4D+0A39
  |           | ਪ + ੍ + ਰ = ਪ੍ਰ = U+0A2A+0A4D+0A30
  |           | ਦ + ੍ + ਵ = ਦ੍ਵ = U+0A26+0A4D+0A35

Table 1/2: Special Characters in Gurmukhi Script with Unicode Code-Points
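
The conjunct rows of the table can be reproduced programmatically. The short Python sketch below joins two consonants with the Gurmukhi virama (U+0A4D) and prints the resulting code-point sequence together with the official Unicode character names from the standard `unicodedata` module. The helper names `make_conjunct` and `code_points` are hypothetical and serve only this illustration.

```python
import unicodedata

VIRAMA = "\u0A4D"  # GURMUKHI SIGN VIRAMA, joins two consonants


def make_conjunct(first, second):
    """Return the sequence consonant + virama + consonant."""
    return first + VIRAMA + second


def code_points(text):
    """Format each character of text as a U+XXXX code point."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


conjunct = make_conjunct("\u0A2E", "\u0A39")  # MA + virama + HA
print(conjunct, "=", code_points(conjunct))
for ch in conjunct:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Whether the sequence is displayed as a single ligature depends on the font and the text-shaping engine; the stored code points stay the same.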


Sorting Rules in Punjabi

I. The sorting order starts with the primary vowels.

II. The words starting with vowels are then combined with the consonants in their
alphabetical order.

III. The consonants combined with the first primary vowels are arranged in their
alphabetical order.

IV. The consonants combined with secondary symbols of the vowels.

V. The consonants combined with consonants according to alphabetical order, i.e.
cluster formation.

VI. The consonants combined with the secondary symbols of the consonants.


Compounds 


A compound word can be formed from already existing words by a process known as
compounding. For example:


Simple Compounding - ਸਿਰ + ਦਰਦ /sir+dərd/ = ਸਿਰਦਰਦ /sirdərd/


Hybridation Compounding - SH+“13" /basstodda/= SAMS /bosodda/ 


Reduplication Compounding - AB+H'™S /mel+mal/= ASH'S /melmal/ 


1.5 Phonological Features of Punjabi 


Phonology is concerned with how sounds function in relation to each other in a
language. Punjabi literature reveals that the supra-segmental phonemes such as Tone,
Nasalization and Stress are realized at the syllable level. There is an abundance of
geminated words in which stress co-occurs on the geminated consonant.


[Figure: tree diagram. Phonology branches into Segmental Phonology (Vowels,
Semi-vowels, Consonants) and Supra-segmental Phonology (Intonation, Nasalization,
Juncture).]

Fig 1/3: Punjabi Phonology

Within phonology, two branches of study are usually recognized: segmental and
supra-segmental. Segmental phonology deals with discrete segments, such as
phonemes, whereas supra-segmental phonology deals with those features which extend
over more than one segment.


Vowels:

Oral vowels: There are 10 vowels in Punjabi, i.e. 7 long vowels, viz. /a/, /i/, /u/,
/e/, /ɛ/, /o/ & /ɔ/, and 3 short vowels, viz. /ə/, /ɪ/ & /ʊ/. Further, these may be
classified into two categories, viz. class I and class II vowels, depending on their
prominence. The class I vowels are phonetically less prominent and have laxer
articulation than those of class II, as discussed by Sharma (1971).


Vowels (V)
Class I vowels (V1): ɪ, ʊ, ə
Class II vowels (V2): i, u, e, o, ɛ (æ), ɔ, a


Fig 1/4: Vowels Categories (monophthongs)


This definition of V1 and V2 will be used in subsequent chapters of this thesis. It is
also noted that the use of short vowels in the initial position of a word and the use
of long vowels in the final position of a word is more prevalent.


Diphthongs: 
Word IPA Diphthongs (Va) 
yr /k*oa/ [al 
fer /\ta/ /ta / 
aet /goi / /9i/ 
Nasal vowels: Correspondingly, there are 10 nasalized vowels, viz. /ã/, /ĩ/, /ũ/, /ẽ/,
/ɛ̃/, /õ/, /ɔ̃/, /ə̃/, /ɪ̃/, /ʊ̃/.


Oral Vowel Nasal Vowel 
ae /katal/ are /kata/ 


Fig 1/5: Example of oral and nasal vowel 


Nasalization is phonemic, and the opposition between nasal and oral is given a
special technical status in the distinctive feature theory of phonology, where it works
alongside other two-way contrasts as part of the complete specification of a sound
system.


Some of these features are discussed below: 


a) It helps in differentiating between grammatical forms e.g.


Word IPA Form 
oat /beti/ Singular 
sar /beria/ Plural 

WJ /kdr/ Noun 

yg /kdr6/ Ablative case 
b) When nasal consonants occur after the vowel in a word, the vowel is usually
nasalized e.g.


Word IPA Meaning 
ve /dén/ Gift/ Blessing 
WS /khan/ Mine 


c) In addition, if the word ends with an open vowel, this vowel also gets
nasalized e.g.


Word IPA Meaning 
ve /déna/ To give 
yet /kSana/ Food 


d) If a diphthong or triphthong occurs at the end of the word with a nasal vowel at
the end, all the prior vowels also get nasalized e.g.


Word IPA Meaning 
Then /gia / Have gone 
nfm /Aia/ Have come 
wip / sia / To wait eagerly 


A similar phenomenon is observed in words containing ਵ /v/, which is a semi-vowel.

The consonant sounds of Punjabi, classified according to their place of articulation
and manner of articulation, are as below:


[Table: Punjabi consonants in IPA, arranged by place of articulation (columns
including Bilabial, Labio-dental, Velar and Uvular) and manner of articulation (rows
including Plosives (stops), voiceless and aspirate, and Affricates, unvoiced).]


Table 1/3: Consonants IPA Chart


1.5.1 Syllable: Sounds are grouped into larger units. The most important of these
is the syllable. The syllable (referred to as S) is a structural unit, and within that
structure we can identify a sequence of consonants and vowels. Just as in grammar we
can parse a grammatical structure, in phonology we can parse syllabic structure. The
syllable is the most basic element and it has psychological reality as a unit that
speakers of a language can identify. Speakers are able to count the number of
syllables in a word and can often tell where one syllable ends and the next begins
phonetically. It is claimed that when identifying syllables, listeners are responding to
sonority. Sonority is the relative loudness of a segment as compared with others.
Each syllable has a single sonority peak. There is no definition of
the syllable that phoneticians or philologists currently agree upon, yet the notion of a
unit at a higher level than that of the phoneme has existed since ancient times.
Under the sonority or prominence view, some sounds are said to have greater
prominence than others and these form the basis of syllables. Syllable boundaries fall
at points of weak prominence. This is governed by a principle determining underlying
syllable division known as the maximal onset principle. It states that intervocalic
consonants are maximally assigned to the onsets of syllables in conformity with
universal and language-specific conditions.
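
The maximal onset principle can be sketched as a small Python procedure: for each consonant cluster between two vowels, the longest cluster-final stretch that is a permitted onset goes to the following syllable, and the remainder closes the preceding one. The function name `syllabify`, the treatment of every vowel as a separate nucleus and the set of legal onsets used in the example are simplifying assumptions for illustration; they do not represent the full Punjabi onset inventory.

```python
def syllabify(segments, vowels, legal_onsets):
    """Split a list of segments into syllables using maximal onsets."""
    nuclei = [i for i, s in enumerate(segments) if s in vowels]
    if not nuclei:
        return [list(segments)]
    boundaries = [0]
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = segments[prev + 1:nxt]
        split = len(cluster)  # fallback: empty onset, whole cluster is coda
        for k in range(len(cluster) + 1):
            # Try the longest candidate onset first (k = 0 takes the whole cluster).
            if k == len(cluster) or tuple(cluster[k:]) in legal_onsets:
                split = k
                break
        boundaries.append(prev + 1 + split)
    boundaries.append(len(segments))
    return [list(segments[a:b]) for a, b in zip(boundaries, boundaries[1:])]


# Illustrative word /sirdərd/; only single consonants are assumed legal onsets.
onsets = {("s",), ("r",), ("d",)}
syllables = syllabify(list("sirdərd"), {"i", "ə"}, onsets)
print(".".join("".join(s) for s in syllables))
```

With a richer onset set that also permitted the cluster `rd`, the same procedure would assign both consonants to the second syllable, which is why the set of legal onsets has to be stated per language.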


The syllable is seen as a unit of neural programming rather than primarily muscular or
acoustic events. If an error is made in the duration of a phoneme, the error is
compensated for within the syllabic unit, suggesting that articulatory events are
programmed in terms of higher-level articulatory units rather than single phonemes.


Every syllable consists of at least a nucleus (N), which is typically a vowel. The
nucleus may be preceded by an onset (O), consisting of one or more consonants, and
followed by a coda (Co), again consisting of one or more consonants. The
constituents are in general assumed to be hierarchically organized as consisting of
Onset and Rhyme / Rime, with the Rime consisting of the nucleus and the coda, as
represented below:

[Figure: tree diagram. Syllable branches into Onset and Rhyme; Rhyme branches into
Nucleus and Coda.]

Fig 1/6: Components of Syllable


A syllable which ends in a vowel is called open and a syllable where the vowel is
followed by one or more consonants is called closed. Here we describe some words
from the English & Punjabi languages.


Open syllable WG /dzao / (Punjabi) 


No /no/ (English) 


Closed syllable JAH /hakom/ (Punjabi) 


Odd /od/ (English) 


The syllable structure of Punjabi 
The canonical syllable structure is represented by Pandey (2014) as:
(C) (C) V (C) (C) 


The valid combinations are elaborated below: 

Monosyllabic Words: V, VC, CV, CVC 

Di-syllabic Words: VCV, CVCV, VCVC, CVV, CVCV, CVV 
Tri-syllabic Words: VCVCV, CVCVCV, CVCVCVC 
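
The canonical template (C) (C) V (C) (C) can be checked mechanically for a single syllable. The Python sketch below reduces a syllable to its consonant/vowel skeleton and tests it against a regular expression equivalent to the template. The function names and the vowel sets passed in are assumptions made for illustration.

```python
import re

# (C)(C)V(C)(C): up to two onset consonants, one vowel, up to two coda consonants.
SYLLABLE_TEMPLATE = re.compile(r"C{0,2}VC{0,2}")


def cv_skeleton(segments, vowels):
    """Map each segment to 'V' if it is a vowel, otherwise to 'C'."""
    return "".join("V" if s in vowels else "C" for s in segments)


def fits_template(segments, vowels):
    """Return True if the syllable matches the canonical template."""
    return SYLLABLE_TEMPLATE.fullmatch(cv_skeleton(segments, vowels)) is not None


print(cv_skeleton(list("sir"), {"i"}), fits_template(list("sir"), {"i"}))
print(cv_skeleton(list("dərd"), {"ə"}), fits_template(list("dərd"), {"ə"}))
```

A word-level check would first divide the word into syllables and then apply `fits_template` to each syllable in turn.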


Disyllabic words are the most frequent in Punjabi; however, monosyllabic
words are also found in abundance. There are plenty of trisyllabic words and few
polysyllabic words.


1.5.2 Tone: Most of the languages of the world which are tonal languages use
tone in a systematic fashion to express either lexical or grammatical distinctions.
There is no standard way in which tones are marked, either in conventional
orthographies or in linguistic representations.

Most of the world's tone languages in Africa and the Americas have relatively
modern spelling devised by missionaries or linguists, in which tone is usually marked
by some kind of diacritic within an alphabetic writing system.


According to Gill and Gleason (1969:48), "There is one tone onset on every word --
the occurrence of a tone may be taken to mark a phonologic word, generally
equivalent to a morphologic word."


Punjabi is highly tonal, Haudricourt (1971), and this is the contrastive feature of
Punjabi (other than Dogri) among Indo-Aryan languages. Punjabi does not have
contour tones such as are found in Mandarin. There are five tonal characters and three
types of tone, i.e. high tone /ó/, low tone /ò/ and mid tone /ō/. Synchronically, tone
placement interacts with accent/stress. In the production of tones there is neither
friction nor stoppage of air in the mouth. Tones are always pronounced concurrently
with a syllable. In the production of the low tone, there is a considerable amount of
constriction in the larynx along with some creakiness. The fall in pitch is followed by
a rise, not to the same level in all cases. The pitch of the voice is raised and falls
in the same syllable in a monosyllabic word, but in polysyllabic words the fall is
realized on the tail syllable which follows the onset syllable. In mid-tone words, the
pitch remains fairly level and may rise towards the end. The rise is not necessarily
realized in all cases.


Joshi (1968: 48) defines pitch as “a sensation, perceived by the listener and
referable to a scale, as well as being related to the frequency with which the vocal
cords of speaker open and close during the utterance and which is measurable by
instrumental techniques.”


A high fundamental frequency is related to high pitch and a low fundamental
frequency to low pitch. In Punjabi speech, as in other languages, it is the pitch of
one syllable or word relative to another in the clause that is important, not the
absolute pitch.

1.5.3 Stress: Stress is the degree of prominence of a syllable, that is, the degree
of force with which a syllable or a word is uttered. Stressed syllables are more
prominent than unstressed ones (and are marked in transcription with a raised
vertical line [']). Stress systems can be divided into two types: metrical and
prominence-driven. In prominence-driven systems, syllables with a high-sonority
nucleus (i.e. a long vowel), onsets, or any of a number of other properties attract
more stress.


Pitch: Frequency [Hz]
Loudness: Intensity [dB]
Length: Duration [sec]


Fig 1/7: Phonetic correlates of Stress 


According to Krishnan (2003), stress placement in Punjabi relies on a three-way
syllable weight distinction: monomoraic light syllables (L), bimoraic heavy
syllables (H) and trimoraic super-heavy syllables (S), the last having either a long
vowel and a coda or a short vowel followed by two coda consonants. He also attested
tonal alternations, viz. the falling tone becoming a falling-rising tone in certain
derived environments. The experimental study by Nara (2015) reported that F0 as
well as duration is used as a marker of stress in tonal words. There is only primary
stress in Punjabi. The syllable with the longest rhyme, such as one with a long
vowel, receives stress.
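
The weight classes and the "longest rhyme" principle can be illustrated with a small sketch. It assumes a simple mora count (a short vowel counts one mora, a long vowel two, each coda consonant one, capped at three); this counting scheme, the tie-break in favour of the leftmost syllable, and all names below are assumptions made for illustration and are not taken from Krishnan (2003).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Syllable:
    long_vowel: bool        # True if the nucleus is a long vowel
    coda_consonants: int    # number of consonants after the nucleus


def mora_count(syl: Syllable) -> int:
    """Assumed scheme: short V = 1, long V = 2, each coda C = 1, capped at 3."""
    return min((2 if syl.long_vowel else 1) + syl.coda_consonants, 3)


def weight(syl: Syllable) -> str:
    """Map the mora count to light (L), heavy (H) or super-heavy (S)."""
    return {1: "L", 2: "H", 3: "S"}[mora_count(syl)]


def primary_stress_index(word: List[Syllable]) -> int:
    """Index of the heaviest syllable; ties go to the leftmost (an assumption)."""
    moras = [mora_count(s) for s in word]
    return moras.index(max(moras))


if __name__ == "__main__":
    word = [Syllable(long_vowel=False, coda_consonants=0),
            Syllable(long_vowel=True, coda_consonants=1)]
    print([weight(s) for s in word], primary_stress_index(word))
```

Under this scheme a long vowel with a coda and a short vowel with two coda consonants both come out as super-heavy, which matches the definition quoted above.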
1.6 Pronunciation Lexicon


1.6.1 Pronunciation Lexicon for Language Learning


The lexicon is the bridge between a language and the knowledge expressed in that
language (Sowa 2005). A dictionary is a book or electronic resource that lists the
words of a language (typically in alphabetical order) and gives their meanings or
equivalent words in a different language; it often also provides information about
pronunciation, origin and usage. The term also covers a reference book on a
particular subject. Readers use a dictionary to learn the exact pronunciation of a
word, although a word may have several pronunciations. Words belong to different
syntactic categories, which determine their distribution, i.e. the contexts in which
they can occur. Dictionaries may be phonological, morphological, syntactic,
semantic, etc., depending on which properties of the items they record. In a
general-purpose dictionary the root word is usually given as the basic entry,
because it is impossible to list all inflectional or derivational forms.


Printed Dictionaries: These may be monolingual, bilingual, trilingual or
multilingual. A monolingual dictionary, such as the Oxford Advanced Learner's
Dictionary, COBUILD's Dictionary of English Language or Shabdkosh (Punjabi-Punjabi
Dictionary), is used for understanding a language and usually contains information
about parts of speech, irregular inflected forms, definitions of meaning in the
same language, and often some pronunciation information.


A bilingual dictionary is consulted for translating into and understanding a second
language. Bilingual dictionaries carry a list of translation and pronunciation
equivalents in the target language. A trilingual dictionary has one source language
with pronunciation and more than one target language. In a multilingual dictionary,
the terms in multiple languages are mapped taking one language as the source
language. It may contain further information, in line with other dictionaries.


e-Dictionaries: An electronic dictionary is a resource that contains a library of
words with their meanings, spellings, pronunciations and etymologies. It is also
used in the background of other programs, such as word processors.


Some dictionaries can also serve as thesauri or translation tools, such as an
English-Hindi dictionary. These days e-readers, tablets and smartphones also have
e-dictionary capabilities. Some of these also offer recorded pronunciations.


Online Dictionaries: An online dictionary is a dictionary that is accessible via the
Internet through a web browser. These may be in audio or video form. Online
dictionaries, which are available in most Indian languages, also provide
pronunciation, e.g.


http://dictionary.cambridge.org/ 


1.6.2 Punjabi Dictionaries with Pronunciation Feature 


Punjabi dictionaries are available in monolingual and bilingual (English & Punjabi)
forms. They provide full information such as lexicon, meaning, pronunciation etc.


Printed Dictionaries

Punjabi-English dictionary written under the supervision of Joshi & Gill (Ed)
(1994): contains about 40,000 Punjabi words, phrases and idioms, along with
grammatical information and the pronunciation of principal words in IPA.

English-Punjabi dictionary by Singh & Sandhu (Ed) (1979): follows the pattern of
Webster’s Third New International Dictionary for the arrangement of lexical data
with pronunciation.

Punjabi-English & English-Punjabi dictionary written by Goswami (2000): covers
25,000 entries and provides meaning, idioms, pronunciation etc.


e-Dictionaries
The Punjabi e-dictionary is available for handheld devices. It provides thesaurus,
pronunciation, translation and synonym-antonym information, etc.


[Screenshot: English word, Punjabi meanings, synonyms, antonyms]

Fig 1/8a: Punjabi e-dictionary


[Screenshot: home, previous, search and pronounce controls; antonyms]

Fig 1/8b: Punjabi e-dictionary


Online Dictionaries: These are dictionaries available on the internet that can be
accessed through a web browser or a mobile device, primarily by using the search
facility. Most of these also provide pronunciation. Some of the websites offering
them are <Dic.learnpunjabi.org>, <yourdictionary.com>, <Dictionary.refernce.com>,
<Thefreedictionary.com> etc. <Tamilcube.com> provides online dictionaries from
English to multiple Indian languages, including Punjabi.
1.6.3 Machine Readable Pronunciation Lexicon


Pronunciation Lexicon for use in Text-to-Speech systems: A pronunciation lexicon
for machine learning is required for the development of speech systems, as it
represents the interface between the acoustic and speech layers. For example, in
Text-to-Speech (TTS) synthesis, phonemic transcriptions are required for the
selection of the proper units to generate the desired waveform. TTS systems are
developed using the following approaches: (a) corpus-based concatenative synthesis,
which concatenates speech units (waveforms) from a database and is known as unit
selection synthesis (Murthy et al. 2013); (b) statistical parametric synthesis, the
source-filter model (Klatt Synthesizer); and (c) statistical acoustic models, viz.
HMM (Hidden Markov Model) based synthesis.


Most TTS systems use a combination of a pronunciation lexicon and rules. They rely
on grapheme-to-phoneme rules as the main pronunciation mechanism, but also provide
a pronunciation lexicon for exceptional words to which the rules do not apply. Such
a lexicon may be large. The TTS engine first refers to the lexicon and generates
the pronunciation by rule if the word is not found there. Sample data of such a
lexicon for Punjabi is given below:


Word IPA 
gfier /pumlka/ 
ay /pa/ 
fed /flt/ 
sre /padzdzor/ 
aot /bethi/ 
usher /paria/ 
3a /bodd/ 
ad /ni/ 
Ove /not{tfon/ 


Table 1/4: Sample Words with Pronunciation for TTS 
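
The lookup order described above (lexicon first, rules as fallback) can be sketched as follows. The rule table, the exception entry and all function names are hypothetical toy values chosen only to make the example runnable; they are not Punjabi data and do not come from Table 1/4.

```python
from typing import Callable, Dict


def make_pronouncer(lexicon: Dict[str, str],
                    g2p_rules: Callable[[str], str]) -> Callable[[str], str]:
    """Build a function that consults the lexicon before applying rules."""
    def pronounce(word: str) -> str:
        if word in lexicon:          # exceptional word: use the stored entry
            return lexicon[word]
        return g2p_rules(word)       # otherwise fall back to the rules
    return pronounce


# Hypothetical letter-to-sound rules; unlisted letters map to themselves.
TOY_RULES = {"c": "k", "x": "ks"}


def toy_g2p(word: str) -> str:
    return "".join(TOY_RULES.get(ch, ch) for ch in word)


# Hypothetical exception that the toy rules would get wrong.
EXCEPTIONS = {"cello": "tʃɛloʊ"}

pronounce = make_pronouncer(EXCEPTIONS, toy_g2p)

if __name__ == "__main__":
    print(pronounce("cello"))   # taken from the exception lexicon
    print(pronounce("cab"))     # generated by the toy rules
```

Keeping the rules behind a plain function makes it easy to swap in a trained grapheme-to-phoneme model later without changing the lookup logic.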
Pronunciation Lexicon for Machine Learning: The syntactic word is considered the
smallest unit in a prosodic hierarchy tree. TTS requires a large pronunciation
lexicon containing morpho-phonological information. Three different methods are
used to learn rules specific to a language:


i. Manually written rules
ii. Probabilistic methods
iii. Machine learning methods


The machine learning approach is the most widely used these days. Prosodically rich
PLS data can be used to develop language-specific models to enhance efficacy.
Hence the sample data given in Table 1/4 needs to be augmented with
supra-segmental information, such as stress, to be useful for this purpose.


1.7. W3C Pronunciation Lexicon Specification (PLS 1.0) 


PLS is designed to enable interoperable specification of pronunciation information
for both speech recognition and speech synthesis engines within voice browsing
applications. Through its language tag, it helps developers specify pronunciation
information accurately for international use.


The W3C has published the Pronunciation Lexicon Specification (PLS) as a
Recommendation; its current version is PLS 1.0 (2008), produced by the Voice
Browser Working Group of the W3C. The specification covers multiple pronunciations
and multiple orthographies in the XML structure at the lexicon level, thus
providing the flexibility to create language-specific PLS documents. A meta tag
feature is available for describing the domain and end use. The PLS specification
provides a framework and guidelines which can be tailored to the needs of a
specific language; the XML tag set can consequently be defined to build PLS data
using IPA in UTF-8 representation.


PLS can be used by Text-to-Speech (TTS) and Automatic Speech Recognition (ASR)
engines and can have a wide variety of applications, such as voice browsers and
pedagogical tools. The Pronunciation Lexicon markup language enables consistent,
platform-independent control of pronunciation for use by voice browsing
applications. The specification can thus be extended to all other human languages
by examining their language-specific requirements. The Pronunciation Lexicon markup
language consists of the following elements and attributes:


Elements      Attributes                                     Description
<lexicon>     version, xml:base, xmlns, xml:lang, alphabet   root element for PLS
<meta>        name, http-equiv, content                      element containing meta data
<metadata>                                                   element containing meta data
<lexeme>      xml:id, role                                   the container element for a single lexical entry
<grapheme>                                                   contains orthographic information for a lexeme
<phoneme>     prefer, alphabet                               contains Pronunciation information for a lexeme
<alias>       prefer                                         contains acronym expansions and orthographic substitutions
<example>                                                    contains an example of the usage for a lexeme


Table 1/5: XML Structure of W3C PLS 1.0
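
To show how the elements in Table 1/5 fit together, the sketch below builds a small PLS 1.0 document with Python's standard xml.etree.ElementTree module. The helper name build_lexicon is hypothetical, and the sample entry (grapheme and IPA string) is an illustrative, unverified transcription rather than data from this study.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"  # PLS 1.0 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"             # for xml:lang


def build_lexicon(entries: Iterable[Tuple[str, List[str]]],
                  lang: str = "pa", alphabet: str = "ipa") -> ET.Element:
    """Build a <lexicon> with one <lexeme> per (grapheme, pronunciations)."""
    ET.register_namespace("", PLS_NS)  # serialize PLS as the default namespace
    root = ET.Element(f"{{{PLS_NS}}}lexicon", {
        "version": "1.0",
        "alphabet": alphabet,
        f"{{{XML_NS}}}lang": lang,
    })
    for grapheme, phonemes in entries:
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        for ipa in phonemes:  # several <phoneme> children = multiple pronunciations
            ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return root


if __name__ == "__main__":
    # Illustrative entry only; the transcription has not been verified.
    sample = [("ਪੰਜਾਬੀ", ["pəndʒaːbi"])]
    print(ET.tostring(build_lexicon(sample), encoding="unicode"))
```

Multiple pronunciations of one orthographic form are expressed simply by passing more than one IPA string for the same grapheme.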
1.7.1 Review of Indian Efforts on PLS


Initial work on the development of a pronunciation lexicon was carried out by
Mandal, Lata et al (2010) on the use of Part of Speech (POS) and morphological
information for resolving multiple pronunciations in the Pronunciation Lexicon
Specification (PLS) for Indian languages. The work used Bengali as a case study
and was presented at the W3C Workshop on Conversational Applications, June 2010,
USA. Using the example of a Bengali word pronounced /forlo/ (moved) or /forol/
(easy), the paper proposes using POS along with morphological information to
resolve multiple pronunciations, which reduces the size of the lexicon. This can be
used to choose the proper pronunciation among multiple pronunciations of the same
orthography. Text-To-Speech (TTS) systems rely on lexicons, which contain
pronunciation information for many words. PLS lexicons provide control over the TTS
playback rendering on conforming reading systems. The proposed morphological
features inside the PLS lexicon make the TTS voice more natural.
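
The idea of selecting a pronunciation by part of speech can be sketched as a two-level lookup. The placeholder key "WORD" stands in for the Bengali orthography, the POS labels are inferred from the glosses "moved" and "easy" and are therefore assumptions, and the function name is hypothetical; this is not the data model of the cited paper.

```python
from typing import Dict, Optional

# grapheme -> {POS role -> pronunciation}. "WORD" is a placeholder for the
# orthographic form; the role labels are assumed from the glosses.
LEXICON: Dict[str, Dict[str, str]] = {
    "WORD": {"verb": "forlo", "adjective": "forol"},
}


def select_pronunciation(grapheme: str, pos: str,
                         lexicon: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Pick the pronunciation matching the POS tag, if the word is listed."""
    by_role = lexicon.get(grapheme, {})
    if pos in by_role:
        return by_role[pos]
    # Unknown POS: fall back to the first listed pronunciation, if any.
    return next(iter(by_role.values()), None)


if __name__ == "__main__":
    print(select_pronunciation("WORD", "verb", LEXICON))
    print(select_pronunciation("WORD", "adjective", LEXICON))
```

In a PLS document the same distinction would typically be carried by the role attribute of the lexeme element listed in Table 1/5.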


1.7.2. Review of International Efforts on PLS 


SI-PRON 


In Slovenian, multiple orthographies are rare but multiple pronunciations are
common. Lexical stress can fall on almost any syllable and obeys hardly any rules.
The SI-PRON pronunciation lexicon contains all the lemmas from the Dictionary of
Standard Slovenian (SSKJ) and the most frequent inflected word forms found in
contemporary Slovenian texts, over 1.4 million lexical entries in all. The lexicon
file contains the orthography, corresponding pronunciations, lemmas and
morphosyntactic descriptors of lexical entries in a format based on requirements.
It also contains a collection of over 190 context-sensitive and context-free
grapheme-to-allophone rules.

It uses the “x-sampa-SI-reduced” phonetic alphabet, a subset of the X-SAMPA set as
defined for Slovenian (Gros et al 2006).


Swedish Pronunciation Lexicon 


A Swedish Pronunciation Lexicon consisting of 8529 words for TTS/ASR has been
developed. It is delivered in two formats: (a) a tab-separated format and (b) an
XML format based on PLS. It follows the SAMPA conventions. In the current version
of the Swedish lexicon there are no special diphthong phoneme symbols. The tag set
used for part-of-speech information is similar to the one used in the Stockholm
corpus (SUC). The lexicon lacks two genitive forms, i.e. proper noun and adjective
genitive forms.


Finite State Pronunciation Lexicon for Turkish 


Similar work has been reported for Turkish: the Finite State Pronunciation Lexicon,
which has approximately 750,000 words. The pronunciation is encoded using SAMPA.
Turkish, being an agglutinating language with extremely productive inflectional and
derivational morphology, has an essentially infinite lexicon. Another important
phonological feature of Turkish is stress. The system produces a parallel
representation of the pronunciation and the morphological analysis of the word
form, so that morphological disambiguation can be used to disambiguate
pronunciation. The computation of the position of the primary stress depends on the
interplay of any exceptional stress in root words and the stress properties of
certain morphemes, and requires a full morphological analysis (Oflazer 2003).


1.7.3 Gaps in PLS 1.0 Specification


The current version, PLS 1.0, is a broad-based baseline specification which
addresses the requirements of Latin-script-based languages only, although it cites
a few examples for Japanese and Chinese. The requirements of many other global
languages, such as Indian languages, have not been discussed.


In Indian languages, grammatical information is encoded in morphology more than in
syntax, unlike English, where grammatical information is an integral part of
syntax. A tonal language like Punjabi has concatenative inflectional morphology.


Hence, PLS 1.0 needs to be revisited with respect to the following:


i. There is currently no provision to encode script information, although some
languages use more than one script.

ii. It needs additional features, such as morphological & syntactic information
associated with pronunciations.

iii. It has no provision to encode borrowed words.


The task of constructing a large pronunciation lexicon is tedious and
time-consuming. There is therefore a need to revisit the current specification of
PLS from the perspective of Indian languages, specifically Punjabi, and to propose
extensions to PLS 1.0 that deal mainly with multiple pronunciations, descriptions
of script, morpho-syntactic descriptions and other language-specific features such
as origin, script of the language, POS tags, stem etc.
1.8 Research Methodology


[Flowchart. Box labels: Define Research Problem; Sourcing from Punjabi Corpus,
Existing Monolingual/Bilingual Dictionaries and Morphological Dictionaries;
Collection of Word Data; Data, Informants and Recording; Phonetic Transcription;
Analysis using Speech Tools & Presentation of Experimented Data; Verification &
Validation of Linguistic Rules; Augmentation of Rules; Presentation of PLS
Framework & sample PLS data based on new Framework]


Fig 1/9: Methodology of Research

The present research involves collection of data, recording, data segregation,
annotation, experimental study and analysis.


1.8.1 Sources of Data & Data Collection 


(1) The data has been sourced from the Punjabi corpus and published Punjabi
dictionaries. The criteria for selection of data can be broadly categorized into:


i. Words containing the five tonemes in the initial, medial and final syllable
ii. Words containing the consonant /h/ and conjuncts of /f/ i.e. J & J in the
initial, medial and final syllable
iii. Non-tonal di-syllabic, tri-syllabic and some poly-syllabic words
iv. Words containing geminated consonants
v. Words containing the same vowel in different positions of the word
vi. Words containing the schwa vowel
vii. Words containing nasalized vowels
viii. Sentences for the study of release vowels

(2) Root words were selected from the dictionaries and their POS variations were
obtained from the online Punjabi Morphological Analyzer Tool for generating sample
XML data.


1.8.2 Informants and Recording 


Patterns of pitch variation are lexically significant in Punjabi and hence are to
be examined. The present study is on the Malwai dialect of Punjabi. Since the
Pronunciation Lexicon (the PLS specification of W3C) is the scope of the current
research, phonetically rich and frequently occurring words of Punjabi were
collected for phonological analysis, covering all phonemes, tonemes, the consonant
/h/ and conjuncts of /f/, and non-tonal words of Punjabi. Frequently used words
alone cannot be used for the study of prosodic features such as tone and stress, so
the data will be specifically designed to fully represent the tone patterns. Word
selection will span monosyllabic, disyllabic, trisyllabic and polysyllabic words
for complete coverage.


Ten informants (4 male and 6 female) in the 25-40 age group, from rural, town and
city backgrounds, were identified. These informants are from the Malwai region of
Punjab, covering Bhawanipur, Kapurthala, Mansa, Patiala, Ludhiana etc. The data
will be recorded by these informants, who are native speakers of Punjabi. Prosodic
features are highly variable and depend on a complex set of factors, including
speaker variables; hence speakers were selected from across the Malwai region. What
is high in pitch for one speaker may be low for another. Averaging the observations
over ten informants will therefore facilitate a fair investigation of the
linguistic features. Representative data, viz. a total of 4000 words across 10
speakers and 50 sentences across 8 speakers, is to be used for the prosodic study.
Data will be recorded in the laboratory with good-quality audio recording devices,
in standard speech and in a noise-free environment with SNR >= 45 dB, as per the
standard procedure for speech corpora development based on the ITU
recommendations. The informants will repeat each word of the word list thrice.
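
The SNR criterion can be checked per recording with a short calculation. The sketch below assumes that the noise level is estimated from a silent stretch of the same recording and that SNR is computed as ten times the base-10 logarithm of the ratio of mean signal power to mean noise power; the text does not state how SNR is measured, so both points, and the function names, are assumptions.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of a block of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """SNR in dB from a speech segment and a noise-only (silent) segment."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def meets_threshold(speech: Sequence[float], noise: Sequence[float],
                    threshold_db: float = 45.0) -> bool:
    """True if the recording reaches the required SNR (45 dB by default)."""
    return snr_db(speech, noise) >= threshold_db


if __name__ == "__main__":
    speech = [1.0, -1.0] * 100       # synthetic full-scale signal
    noise = [0.001, -0.001] * 100    # synthetic low-level noise floor
    print(round(snr_db(speech, noise), 2), meets_threshold(speech, noise))
```

A recording that fails the check could be re-recorded before segmentation, so that later pitch and intensity measurements are not affected by background noise.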
All recorded data will be segregated with the help of the Goldwave tool, a
professional digital audio editor. The middle samples of the isolated words as
recorded will be free from contaminating contextual influences and will be used for
investigation. All the segregated data will be used for measuring the pitch,
intensity, duration, formants etc. of the recorded samples.


1.8.3 Data Analysis and Presentation 


The annotation of the recorded speech will be carried out. The label "transcription"
is used to refer to any symbolic representation of the significant side of
documented speech events. Types of transcription include orthographic, phonemic and
phonetic transcriptions of segmental information, and transcription of prosody and
of paralinguistic and non-linguistic phenomena. Phonetic transcription gives a
faithful rendition of variation in pronunciation, which may turn out to be relevant
for the description of sociolects or dialects (Gippert et al 2006). All the
recorded speech data will be transcribed phonetically and tabulated to reveal the
nature of variations in pronunciation. The annotation of the recorded speech data
at phoneme level will be carried out using the PRAAT software package, since it is
a very flexible tool for speech analysis. A pitch floor of 128 Hz and a pitch
ceiling of 390 Hz will be used. This tool will also be used for analysis of the F0
contour and the slope of the contour over the pitch area of the associated vowel.
Spectrographic analysis of all the male & female samples will be carried out, and
the above parameters will be recorded. The literature on Punjabi reveals that
supra-segmental phonemes such as tone, nasalization and stress are realised at the
syllable level and will hence require annotation at syllable level also. There is
an abundance of geminated words in which stress co-occurs on the geminated
consonant, which will also be examined. For the analysis of the Punjabi tones,
release vowel quality etc., the fundamental frequency and formants of the
associated vowel will be studied. A MATLAB (MATrix LABoratory) algorithm will be
developed to get mean pitch and duration. MATLAB is a high-level matrix/array
language with control flow statements, functions, data structures, input/output
and object-oriented programming features.
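
The mean pitch and duration computation can be sketched as follows. This is an illustrative Python outline, not the MATLAB algorithm referred to above; it assumes that each token is available as a list of F0 values in Hz (with 0 for unvoiced frames) plus start and end times in seconds, and the function names are hypothetical. The 128 Hz and 390 Hz limits are the pitch floor and ceiling stated in the text.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def mean_pitch_hz(f0_track: Sequence[float],
                  floor: float = 128.0, ceiling: float = 390.0) -> Optional[float]:
    """Mean F0 over frames inside [floor, ceiling]; None if no frame qualifies."""
    voiced = [f for f in f0_track if floor <= f <= ceiling]
    return mean(voiced) if voiced else None


def summarize(tokens: List[Tuple[Sequence[float], float, float]]
              ) -> Tuple[Optional[float], float]:
    """Average the per-token mean pitch and duration over repetitions.

    Each token is (f0_track, start_sec, end_sec). Tokens with no voiced
    frames are skipped for pitch but still count for duration.
    """
    pitches = [p for p in (mean_pitch_hz(t[0]) for t in tokens) if p is not None]
    durations = [end - start for _, start, end in tokens]
    return (mean(pitches) if pitches else None, mean(durations))


if __name__ == "__main__":
    repetitions = [([0, 200, 220, 0], 0.10, 0.50),
                   ([0, 210, 230, 0], 0.20, 0.70)]
    print(summarize(repetitions))
```

Filtering by the floor and ceiling also discards octave errors and unvoiced frames, which would otherwise pull the mean away from the speaker's actual range.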




Graphs will be plotted for sample words exhibiting pitch contour, duration,
intensity and formants. The analysis will be presented on acoustic features of
Punjabi.


A pronunciation lexicon specification for Punjabi within the W3C framework will be
proposed on the basis of the above analysis.


The parameters recorded for analysis will follow the scientific methodology given
below for corroboration of results:


Acoustic -> Auditory -> Phonological Category

Acoustic: Fundamental frequency; Formants; Duration; Intensity; Pauses/silence

Auditory: Pitch; Length; Loudness; Stress; Grouping; Voice quality

Phonological Category: Tone; Quantity (Vowel duration, Gemination); Lexical stress;
Levels in syllable hierarchy


Table 1/6: Parameters for analysis


The acoustic characteristics of spectrograms will be corroborated with auditory
parameters and tabulated. This experimental data will be scientifically analysed
to establish the phonological parameters with reference to PLS.


1.8.4 Assumptions 


i. It is assumed that the work carried out will be, by and large, applicable to
all other Indo-Aryan languages, except for features such as tone and gemination
that are specific to Punjabi.

ii. The research work will be carried out on words recorded by native Punjabi
speakers from the Malwai region of Punjab.

iii. It is assumed that the speakers are representative of the major Punjabi
community of the Malwai region of Punjab.

iv. The research findings will be reported based on the analysis of data recorded
by 10 speakers (4 male & 6 female), and it is assumed that this can be
extrapolated for reporting the research findings.

v. The parameters for acoustic analysis are selected on the basis of a review of
international research efforts in this area.

vi. The definition of the syllable varies from one source to another in the
literature. Therefore light, heavy and super-heavy syllables will be defined for
the current scope of research.

vii. For stress analysis, complete coverage of di-, tri- and poly-syllabic words
will be achieved on the basis of the various combinations of syllables as per the
above syllable definitions.

viii. The study of stress in disyllabic words is to be reported on the basis of
Linear Regression Analysis.

ix. The stress findings for tri-syllabic & poly-syllabic words are extrapolated
from the analysis carried out for di-syllabic words and are also based on
experimental work on a small set of data.

x. A phonological study of the schwa vowel is to be carried out to report
variations in its behavior in different contexts, and also as a release vowel,
based on a limited set of data.

xi. It is assumed that the new PLS framework within W3C guidelines, proposed on
the basis of acoustic analysis of a limited set of data, will be largely
applicable to the Punjabi language.

xii. The PLS data developed on the basis of the proposed framework will be of
immense benefit to TTS and ASR researchers building Punjabi speech systems.

xiii. Drawing on examples from international efforts, it is assumed that computer
scientists can further develop finite state machines for faster generation of PLS
data based on the proposed framework.
1.9 Organization of the Thesis


The thesis is organized as follows. Chapter 2 presents a literature review on the
tonogenesis of Punjabi. The experimental verification and validation of the tonal
features of Punjabi is reported in Chapter 3. Chapter 4 focuses on lexical stress
and the stress resulting from the presence of tone and gemination. Chapter 5
discusses the phonetic and phonological analysis of the schwa vowel and some other
findings on the release vowel. The morpho-syntactic features, such as POS-based
lexical variations, and other co-articulation features are described in Chapter 6.
The supra-segmental features discussed in the previous chapters are presented in
Chapter 7 as lexeme elements, attributes & rules for marking supra-segmental
features. These features are represented in XML format to present the PLS
framework (PLS 2.0) for Punjabi within the W3C framework, and sample XML examples
of Punjabi data are also given for reference. Chapter 8 presents the theoretical &
practical work done, along with the research findings and a path for future
research.

Chapter 2


Tonogenesis of Punjabi: Literature Review


2. Introduction

Tone is the use of pitch in a language to distinguish lexical or grammatical
meaning, that is, to distinguish or to inflect words, as corroborated by Pike
(1948) and Welmers (1959):


“... having significant, contrastive, but relative pitch on each syllable” [Pike 1948:3]


“... in which both pitch phonemes and segmental phonemes enter into the composition
of at least some morphemes” [Welmers 1959:2]


While Pike originally saw tone as a contrastive feature on each syllable or other
tone-bearing unit (TBU), Welmers’ definition insists on the morphological nature of
tone: tone is a property not of syllables, as expressed by Pike, but of morphemes.
Not all morphemes need to have a TBU; they may be “tonal morphemes”. Tone being
supra-segmental in nature, the tone features described below are ‘overlaid’ on
segments and are not inherent to the definition of segments. The term tone language
has traditionally been used to refer to those languages which use the feature of
tone to distinguish between lexical items. A syllable is pronounced with different
tones in order to differentiate meaning. According to Clark & Yallop (1990), in
“Tone Languages”, tone is a feature of the lexicon, expressed as prescribed pitches
for syllables or sequences of pitches for morphemes or words; in some cases it may
distinguish the meanings of words, and thus tone is a significant part of a
syllable. Linguists working within the Generative Phonology paradigm look for a set
of features for characterizing tone and other prosodic phenomena of a language.
Most tone languages have a number of rules that modify tones when spoken in a
sequence, i.e. when spoken in normal phrases rather than in isolation.

Within the generative tradition, the study of word-prosodic typology was greatly
influenced by McCawley (1968, 1970), who attempted to set up a principled
distinction between tone and pitch-accent systems based both on distributional
properties and rule types (tones tend to assimilate, accents tend to dissimilate or
reduce). A survey of subsequent literature reveals that the terms “accent”, “pitch
accent” and “tonal accent” have generally been used to refer to tone systems which
are defective in the sense of restricting tones by number of contrasts or by
position: “A pitch-accent system is one in which pitch is the primary correlate of
prominence and there are significant constraints on the pitch patterns for words.”
Bybee et al (1998:277).


Tone exhibits long-distance effects within and across words, i.e. the tone of one
word may migrate several syllables or words to its right. The word-level tones are
assigned by rule.
The tone-bearing unit (TBU) can be any one of the following:

a. The entire syllable (or the voiced part of it)

b. The rime portion of the syllable (but not the onset portion)

c. The mora (including the onset)

d. The moraic segment (the segment in the rime)


There is general consensus that in both tonal and non-tonal languages, the tone
melodies that are present are best analyzed as consisting of a sequence of one or
more tones (generally called High/Mid/Low). In almost all cases, the rising and
falling tones encountered on a single syllable (known generally as contour tones or
dynamic tones) are best analyzed either as allophonic variants of a level tone or,
more commonly, as the realization of a sequence of two level tones. It is difficult
to draw a sharp boundary between tonal and non-tonal languages, as described by
Goldsmith (1994):


a. The span of each tone melody is roughly the size of a word in a tone language,
whereas in a non-tonal language its size ranges between that of a syntactic phrase
and that of a sentence.

b. The tone melody of an utterance in a tone language is composed of the tone
melodies that are directly contributed by the lexical items in the utterance and,
to a slightly lesser extent, by the syntactic constructions present in the
sentence, whereas the tone melody of an utterance in a non-tone language is
generally determined by the information structure of the sentence.

c. Tone languages generally have phonological rules that modify the tone melody
depending on the tones found around them as well as on the syntactic structure in
which they occur.


Tone systems are found in approximately 50% of the languages of the world. The
greatest concentrations of “tone languages” are found in Sub-Saharan Africa, East
and Southeast Asia, South central Mexico and parts of Amazonia and New Guinea.
The study of tone has influenced the history of phonology and has contributed to the
understanding of languages in general, and in particular to the study of the
syntax-phonology interface. Tone systems have properties which surpass segmental and
metrical systems.


Tone cannot be studied the same way as other phonological phenomena. In the
case of voicing, nasality, vowel length and other phonological contrasts, the normal
technique is to first elicit individual words to determine the phonetic properties and
ultimately the phonemic contrasts. In the case of tone, this technique may yield tonal
minimal pairs and/or require specific contexts or “frames” in which the full range of
contrasts can be discerned.


Welmers (1959) describes a discrete level tone system as one where the pitch values of
the different tones are maintained in an approximately standard relationship to each
other. He also introduced the notion of downstep, a lowering process in
tonal phonology which applies to the second of two high tone syllables.
This affects the choice of tone after a high tone syllable.
After a low tone, the tone of the next syllable can only be low or high. After a high tone,
however, the next tone can be low, high or downstepped high (that is, a pitch
slightly lower than the preceding high but not so low that it would be counted
as a low tone). A high tone after a downstepped high is on the same level as that
downstepped high. A phonological feature called upstep has also been discovered. Gill
H.S. & Gleason (1969) analyzed place of articulation and manner of
articulation in depth in the context of tones and concluded that the tone system in the
Punjabi language is well developed and established.
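
The downstep behaviour described above for discrete level tone systems can be sketched in a few lines of Python. This is only an illustration: the function name `realise_tones`, the labels 'H', 'L' and '!H', and the numeric pitch levels are assumptions made for the example, and the sketch also assumes that a lowered high level persists for the rest of the sequence.

```python
def realise_tones(tones, high=3.0, low=1.0, step=0.5):
    """Map a sequence of 'H', 'L' and '!H' (downstepped high) to pitch levels.

    The numeric levels are arbitrary illustrative units, not measurements.
    """
    ceiling = high   # level currently used for high tones
    previous = None
    levels = []
    for tone in tones:
        if tone == "L":
            levels.append(low)
        elif tone == "H":
            # A high after a downstepped high stays on the lowered level.
            levels.append(ceiling)
        elif tone == "!H":
            # Downstepped high is only available after a (downstepped) high.
            if previous not in ("H", "!H"):
                raise ValueError("downstep must follow a high tone")
            ceiling -= step  # slightly lower than the preceding high
            levels.append(ceiling)
        else:
            raise ValueError(f"unknown tone label: {tone!r}")
        previous = tone
    return levels


print(realise_tones(["H", "!H", "H", "L", "H"]))
```

With the assumed defaults, a small number of downsteps keeps the high level above the low level; a fuller model would have to decide what happens after many downsteps.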


2.1 How to Measure Tone 


The melody of an utterance is communicated chiefly by movements in time of the
pitch of the voice. Pitch as such is a perceptual concept. It is the perceptual correlate
of vocal fold vibration during the voicing of segments. Pitch changes can occur due to
variations in laryngeal activity and can occur independently of stress changes. They are
associated with the rate of vibration of the vocal folds. Because each opening and closing
of the vocal folds causes a peak of air pressure in the sound wave, we can estimate the pitch
of a sound by observing the rate of occurrence of the peaks in the waveform. To be
more exact, we can measure the frequency of the sound in this way. Frequency is a
technical term for an acoustic property of a sound, namely the number of complete
repetitions (cycles) of variations in air pressure occurring in a second. The unit of
frequency measurement is the hertz, usually abbreviated as Hz. If the vocal folds
make 220 complete opening and closing movements in a second, we say that the
frequency of the sound is 220 Hz. The pitch of a sound is an auditory property that
enables a listener to place it on a scale going from low to high, without considering
its acoustic properties. In practice, when a speech sound goes up in frequency, it also
goes up in pitch. For the most part, at an introductory level of the subject, the pitch of
a sound may be equated with its fundamental frequency F0. Tone is observed
through this change in pitch over an utterance.

According to Carnochan (1964), “Pitch is a sensation, perceived by listener and
referable to a scale, as well as being related to the frequency with which the vocal
cords of the speaker open and close during the utterance and which is measurable by
instrumental techniques.”
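
The idea of measuring frequency by counting repetitions in the waveform can be illustrated with a small Python sketch. It is a simplification under stated assumptions: instead of pressure peaks it counts upward zero crossings (one per cycle), the signal is a synthetic 220 Hz sine wave rather than recorded speech, and the names `estimate_frequency` and `sample_rate` are chosen for this example only.

```python
import math


def estimate_frequency(samples, sample_rate):
    """Estimate frequency in Hz by counting cycles (upward zero crossings)."""
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0.0 <= samples[i]
    ]
    if len(crossings) < 2:
        return 0.0
    cycles = len(crossings) - 1                      # complete cycles observed
    duration = (crossings[-1] - crossings[0]) / sample_rate
    return cycles / duration                         # cycles per second


sample_rate = 16000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
print(f"{estimate_frequency(tone, sample_rate):.1f} Hz")
```

Real speech is far less regular than a sine wave, which is one reason the dedicated pitch analysis tools described in the following sections are needed.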


2.1.1 Methods and Apparatuses for Experimental Phonetics 
There are two methods which are used to study speech sounds: 


Direct Observational Method: In this method, the investigator relies upon his
personal impressions and observations. He observes and listens to a subject in the act
of speech and then tries to describe the physiological processes involved in the
pronunciation of a particular speech sound. In this method, the degree of accuracy
depends on the experience and training of the observer engaged in research. The
literature survey reveals some accurate descriptions of the articulatory structures of
speech sounds given by a few phoneticians who have made use of this method. But
now greater emphasis is laid on empirical evidence to verify and confirm the findings
of the phoneticians.


Instrumental Methods: These methods are preferred over observational methods as
they eliminate the possibility of subjective distortions which could be introduced
by a phonetician. However, the method of observation has not lost its significance
since experienced phoneticians still use it. It does not exclude but presupposes
instrumental methods. Thus speech should be investigated by combining both
techniques to get the best results. Instrumental methods may be divided into methods
investigating articulation and methods of physical analysis of speech sounds, the
nature of stress and intonation. The experimental work in this thesis will focus on
physical analysis.
2.1.1.1 Types of Instrumental Methods 


Recording the pitch and the intonation contour of spoken words and sentences has
long been a focus of phonetic and linguistic research.
It is well recognized that a sufficient description is not possible by human hearing
alone. Instead, experiments and measuring devices had to be developed for pitch
analysis. “Pitch determination is one of the most important but also most delicate
problems in speech analysis”. This statement from the standard book on “electronic
means in this field”, Hess (1983:3), describes a scientific problem which was known
long before the computer found its way into the phonetic laboratories;
phoneticians had already become aware of the importance of pitch measurement. The
chronology of the various techniques is as under:


a) Pneumatic Kymograph: This mechanism was utilized for examining the
physical aspects of speech in the first laboratories of experimental phonetics. Air
motions caused by the speech sounds were changed into mechanical vibrations of a
stylus, which left traces of the recorded speech on the turning drum of the
kymograph, producing a kymogram as shown in the figure below:

Fig 2/1: Kymogram

The investigation of pitch using these devices posed many problems and
was time consuming, hence the pneumatic kymograph was replaced by an electronic
kymograph which registers the speech wave and singles out the main acoustic parameter
of speech, the fundamental tone (melody). The time marker below the kymogram made it
possible to calculate the duration of the speech signal.

Fig 2/2: Electro-Kymogram (time axis: t in msec)

b) Intonograph: It is an electronic device which registers the speech signal as a sound
wave and singles out the main acoustic characteristics. The following main
physical characteristics of speech can be separated and registered on the
intonogram:

Fig 2/3: Intonogram (vertical axis: cycles per sec)


• Fundamental frequency (measured in cycles per second) is marked by a curve at
the bottom of the intonogram. The higher the curve rises, the higher is the
fundamental frequency. The control signs of
the fundamental frequency are situated at the upper line of the intonogram.

• Intensity (measured in mm; the conventional unit is dB) is marked by a curve in the
upper part of the intonogram. The lower the curve of intensity falls, the greater
is its value.

• Time marker makes it possible to calculate the duration of the utterance or its
parts, measured in msec.

• The intonograph makes it possible to investigate intonation and stress as well as
other phonetic phenomena.


c) Spectrograph: It allows speech investigators to study the physical characteristics
of speech, the acoustic boundaries between sounds in speech etc. in the form of a
spectrogram, which has time along the horizontal axis and
frequency along the vertical axis, while the amplitude of the signal at any given time
is shown as a grey level. Conventionally, black is used to signal the most energy,
while white is used to signal the least energy.
Spectrograms are of two types:

• Wide-band Spectrogram: A spectrogram produced using an analysis scheme
which emphasizes temporal changes in the signal: with short-time spectrum
calculations (about 3 ms) or highly damped analysis filters (about 300 Hz).

• Narrow-band Spectrogram: A spectrogram produced using an analysis scheme
which emphasizes frequency changes in the signal: with long-time spectrum
calculations (about 20 ms) or lightly damped analysis filters (about 45 Hz).
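
The trade-off between the two spectrogram types can be made concrete with a short Python sketch. It relies on a common rule of thumb, assumed here rather than taken from the text, that the analysis bandwidth is roughly the reciprocal of the window duration; the 16 kHz sampling rate and the function name `analysis_window` are likewise illustrative assumptions.

```python
def analysis_window(duration_ms, sample_rate):
    """Return (samples per window, approximate bandwidth in Hz)."""
    duration_s = duration_ms / 1000.0
    n_samples = round(duration_s * sample_rate)
    bandwidth_hz = 1.0 / duration_s  # rule of thumb: bandwidth ~ 1 / duration
    return n_samples, bandwidth_hz


sample_rate = 16000  # assumed sampling rate in Hz
for name, ms in [("wide-band", 3), ("narrow-band", 20)]:
    n, bw = analysis_window(ms, sample_rate)
    print(f"{name}: {n} samples per window, bandwidth about {bw:.0f} Hz")
```

The rule of thumb gives figures of the same order as the filter bandwidths quoted above: a short window resolves rapid temporal changes but smears frequency, while a long window resolves individual harmonics but smears time.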


d) Kay Sonograph: It is a workstation for speech analysis, a powerful tool for
speech scientists and other speech professionals. It produces real-time speech
analysis on a high resolution display monitor. On-screen waveform editing and
speech parameter extraction help to analyse speech and select segments for further
work. Both narrow-band and wide-band spectrographic analysis can be performed in real
time. These analyses can be edited, stored and printed.


Fig 2/4: Kay sonogram (labels: consonant, vowel, transition, formant, VOT; time axis: 100 msec/division)


e) Computer: It is an electronic device which can simultaneously acquire, store in
memory, analyse and display speech signals, and it also produces the required
results from the stored data. Computer speech programmes provide all these
possibilities for phonetic professionals. They are a powerful tool for acoustic
analysis of all the phonetic phenomena of speech as they can combine the results of
the two main types of analysis, intonographic and spectrographic.
In the upper part of the computer intonogram, speech is recorded in the form of a
sound wave. In the middle part of the intonogram, the overall fundamental frequency is
recorded in the form of a curve. The higher the curve rises, the higher the value
of the fundamental frequency is. In the lower part of the intonogram, the amplitude and
intensity of the speech signal are recorded. The greater the intensity of the speech
signal is, the higher the impulses of the intensity rise.

Fig 2/5: Computer Intonogram (total duration 0.596231 seconds)


Praat: It is a software tool with which one can study the acoustic characteristics of a
sound file by viewing and measuring the sound file's waveform and spectrogram.
Pitch range settings in Praat are the most important parameters used in pitch
analysis. The pitch floor determines the window length and the pitch
ceiling restricts the values considered during the analysis. The default values
of pitch floor and pitch ceiling are 75 Hz and 500 Hz respectively. This tool can be used
for acoustic analysis by documenting various parameters of the sound waveform such as
the value and slope of F0, formants, intensity, duration etc.
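
How a pitch floor and a pitch ceiling shape a pitch analysis can be sketched with a plain autocorrelation estimator in Python. This is not Praat's actual algorithm: it is a minimal illustration that assumes a window spanning three periods of the pitch floor, tests only lags between the period of the ceiling and the period of the floor, and uses a synthetic 220 Hz sine wave. The function name `estimate_f0` is hypothetical.

```python
import math


def estimate_f0(samples, sample_rate, pitch_floor=75.0, pitch_ceiling=500.0):
    """Estimate F0 (Hz) of one frame by picking the best autocorrelation lag."""
    # Assumption: the window spans three periods of the lowest allowed pitch,
    # so a lower pitch floor means a longer analysis window.
    window_len = round(3 * sample_rate / pitch_floor)
    frame = samples[:window_len]
    if len(frame) < window_len:
        raise ValueError("signal shorter than the analysis window")
    # The ceiling and floor bound the candidate periods (lags in samples).
    min_lag = math.ceil(sample_rate / pitch_ceiling)   # shortest period tested
    max_lag = math.floor(sample_rate / pitch_floor)    # longest period tested
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(window_len - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sample_rate = 16000  # assumed sampling rate in Hz
signal = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(1000)]
print(f"estimated F0: {estimate_f0(signal, sample_rate):.1f} Hz")
```

Because lags are whole numbers of samples, the estimate is only approximate. Setting the ceiling below the true pitch (for example 150 Hz for this 220 Hz signal) forces the estimator to report a value under the ceiling, which shows why the range should be chosen to suit the speaker.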


Gold Wave: It is a professional digital audio editor that plays, records, edits,
processes and converts audio on the computer. Gold Wave includes a complete set of
audio processing features.
An intuitive and customizable user interface makes editing easy. An independent
Control window provides direct access to audio devices. It contains controls for
playback, rewind & fast forward, recording, volume, balance and speed. Real-time
visuals display the sound during playback and recording. A multiple document
interface (MDI) allows several files to be opened at one time, simplifying file-to-file
editing.


Matlab (matrix laboratory): It is a fourth-generation high-level programming
language and interactive environment for numerical computation, visualization and
programming. It has powerful built-in routines that enable a very wide variety of
computations. It also has easy to use graphics commands that make the visualization
of results immediately available. Specific applications are collected in packages
referred to as toolboxes.


2.2 The Analysis of Pitch Patterns in Tone Systems

Tone is a linguistic term that refers to a phonological category that distinguishes two
words or utterances and is only relevant for languages in which pitch plays some sort
of linguistic role. It is established through research studies that the vibrations of the
vocal cords result in changes of pitch and that this change of pitch is used to distinguish
words. The change of tone results in distinctive word formation. Tone in the
linguistic domain gets mapped to F0 in the phonetic domain. F0 is an acoustic term
referring to the speech signal of the lexical items and reflects how many pulses per
second are contained in the signal. The perception of tone must be dependent in
whole or in part on pitch perception, and thus on fundamental frequency. The speech
signal must contain large enough F0 fluctuations to be perceptible as pitch
differences. Therefore tone is an inherent expression of pitch that contrasts with
other expressions of pitch. Tone is neither pitch variation within a defined perceptual
space nor a system of pitches expressed relative to a single segment in segmentally
based minimal pairs; it is phonetically analyzed relative to other F0 segments that
are in sequence with it rather than being looked at as a segmental attachment.
The various features of any tonal system can be broadly categorized as:

General Criteria                   Specifics
a. Number of level tones           • At least four, possibly five
b. Contour tones                   • Rising, falling, convex, concave
                                   • Sometimes the result of combining two
                                     or more levels
c. Contour tone contrasts          • Two or three
d. Common alternations             • Assimilation & dissimilation,
                                     simplification and formation of
                                     contour
e. Tonal markedness                • In a two level tone system, low is
                                     usually unmarked.
                                   • In a three level tone system, mid is
                                     usually unmarked.
                                   • Level tones are less marked than
                                     contours.
f. Tonal and laryngeal features    • Low tone associated with voicing,
                                     and especially
                                   • High tone associated with
                                     voicelessness

Table 2/1: Features of Tonal System


Thus the acoustic properties of the speech signal relate to the phonological information
which the signal conveys. The vibration of the vocal folds is periodic and is known as
phonation. Several aspects of the phonation waveform combined together result in the
spectrum. The slope of the spectrum represents voice quality, i.e. the rate of airflow
during the phonatory cycle. All native language speakers exhibit variation in duration and
amplitude from cycle to cycle of phonation. The phonation waveforms and spectra
represent idealizations of natural speech; hence they can be used for phonological studies.
The pitch of the speech signal is the perceptual correlate of frequency: the higher the
frequency, the higher is the pitch. The pitch contours can be studied using
spectrograms of the speech signal of native speakers with software tools as described in
section 2.1.1.1(e). Thus F0 contours of recorded words can be analytically examined to
study the tonal characteristics of a language.


The phonetic facts for publishing linguistic data on tones are plots of fundamental
frequency over time. There has been a consensus among various linguistic theories
that tone is always transcribed on the syllable nucleus, which is usually a vowel.
Tone may, however, be phonetically realized on any voiced sonorant segment in the
syllable.


2.2.1 Types of Tones and Notations 


Tone is primarily the contrastive use of pitch in grammar and lexicon, including
movement from level to level. The first question is: what are the fundamental pitch
levels? The simplest systems have a two-way contrast between higher and lower
pitch, H and L. In a tone language, distinctive pitch levels and contours along with
vowels and consonants serve to make up a word. Such languages vary as to how
many phonologically relevant tones they have. In contour-tone languages, at least
some of the tones must be described in terms of pitch movements such as rises and
falls or more complex movements such as rise-falls. This is characteristic of many
tone languages of South-East Asia. The nature of tones can thus be broadly
categorized as:


2.2.2 Register Tones 


Register-tone languages use tones that are level, i.e. they have relatively steady-state
pitches which differ with regard to being relatively higher or lower. This is
characteristic of many tone languages in West Africa. Register tones are a small number of
tones which are marked over vowels, e.g. á, à and ā for high, low and mid (level)
tones. These symbols don’t give an impression of the pitch movement.
These symbols are further combined to get combinations of high and low, i.e. falling
(high + low) etc. Gur, Atlantic, Mande, Dogon, Nilo-Saharan, Chadic and Cushitic
languages usually have two level tones. Examples of 3-level languages are Angas,
Peki Ewe, Ebira, Kasem, Kotoko, Kpelle, Logo, Mbay, Yoruba and Ibibio. Kotoko
has the 3-tone system H M L. The representation of Register Tones is illustrated
below:

e̋  ˥  Top
é  ˦  High
ē  ˧  Mid
è  ˨  Low
ȅ  ˩  Bottom

Tone terracing
ꜛ  Upstep
ꜜ  Downstep

Fig 2/6: Register Tone Levels


In some languages (Shona, Kipare, Mbololo Taita, Miya), syllables are either H or
L, without phonological rising tones, which involve F0 movement from level to level.


2.2.3. Contour Tones 


The analysis of contour tones as clusters of level tones has been widely adopted by African
phonologists, but it has met considerable scepticism from Chinese phonologists, e.g.
Yip (1989), Bao (1990), Cahn (1991). Contour tones pose two problems for
distinctive feature theory. First, if contour tones are basic units, they require
trajectory features such as rise and fall, or a modified version of them, as shown:

Rise           Fall
 TBU            TBU
 / \            / \
L   H          H   L

Fig 2/7: Model of Contour Tone Units

Many languages have phonological contour tones. Some allow contour tones only on
long syllables; for example, Hausa and many Bantu languages (Tachoni, Dembwa
Taita) have just a falling tone, and only on long syllables. Many languages have
contours on short vowels: thus Gen and Temne have H, L, Rising and Falling tones;
Angas has 3 tone levels and the 4 rising and falling contours which do not end with a
Mid tone; Benchnon (Wedekind 1983) has 5 levels but only one contour, a 4-3 rising
tone. These languages lack long syllables.


Languages with four tone levels are much less common and include Bariba, Anlo
Ewe, Grebo, Igede, Kamba and Wobe. Five levels are quite rare, occurring in
Benchnon and the Santa dialect of Dan, and only Chori is reported to have six. The
Santa dialect of Dan (Bearth & Zemp (1967), Filk (1997)), which has 5 levels and
contrastive length, allows one short contour (2-3 fall) but 5 long contours (rises 3-2,
3-1 and falls 1-5, 2-5, 3-5), far fewer than the 20 possible contours. The representation
of Contour Tones is illustrated below:


Rising
Falling
High rising
Low rising
High falling
Low falling
Peaking
Dipping

Fig 2/8: Contour Tone Levels
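
The count of 20 possible contours in a five-level system, and the contour shapes named in the figure, can be checked with a short Python sketch. It assumes tones written as digit strings on a five-point scale with 1 as the lowest and 5 as the highest level; the function name `classify` and the coarse shape labels are choices made for this example, and shapes with more than one change of direction are simply labelled by their first movement.

```python
from itertools import permutations

LEVELS = range(1, 6)  # assumed 5-point scale: 1 = lowest, 5 = highest


def classify(tone):
    """Classify a tone written as digits on the 5-point scale, e.g. '55' or '214'."""
    points = [int(ch) for ch in tone]
    if not points or any(p not in LEVELS for p in points):
        raise ValueError(f"not a tone on the 5-point scale: {tone!r}")
    # Keep only the actual pitch movements between successive points.
    steps = [b - a for a, b in zip(points, points[1:]) if b != a]
    if not steps:
        return "level"
    if all(s > 0 for s in steps):
        return "rising"
    if all(s < 0 for s in steps):
        return "falling"
    return "peaking" if steps[0] > 0 else "dipping"


# Every ordered pair of two different levels is a possible simple contour.
two_point_contours = [f"{a}{b}" for a, b in permutations(LEVELS, 2)]
print(len(two_point_contours), "two-point contours")
for tone in ["55", "25", "21", "214"]:
    print(tone, classify(tone))
```

Five levels give 5 x 4 = 20 ordered pairs, half of them rising and half falling, which is the inventory against which the few contours attested in a given language can be compared.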


These pitch movements are represented on a 5-point scale (1 = lowest & 5 = highest)
by means of tone letters consisting of a vertical reference line on the right preceded
by a line indicating pitch. Often the tone is also explicitly described by a series of
numbers on the 5-point scale. It is basically a stylized representation; however, it lacks
the details of actual pitch contours.

Tone Sandhi is easy to represent using tone numbers. Mandarin, Cantonese and Thai
from the Asian region belong to this category. The major characteristics of Yip’s (2001)
and Barrie’s (2007) proposal for contour tones in Chinese languages are as follows.
First, as is generally assumed for Chinese, contour tones are unitary entities, with
only one tonal root node. Second, only one register feature [±upper] is specified for
the whole contour tone. Third, only the tonal onset, but not the tonal offset, is specified
for the pitch feature [±raised]. That is, this is a one-target proposal, with only the
tonal onset explicitly and fully specified (cf. the two-target unitary-entity proposal in
Yip (1989)). Fourth, a [contour] feature (Barrie) or an unspecified “rebound” (Yip)
signals a contour tone. All these properties are illustrated by the following examples,
based on Barrie’s system.


S. No.  Tone               Onset                  Offset
                           [±upper]  [±raised]    [±upper]  [±raised]
1.      High-Level   55    +         +            +         +
2.      Mid-Level    33    +         -            +         -
3.      Low-Level    22    -         +            -         +
4.      High-Rising  25    -         +            +         +
5.      Low-Rising   23    -         +            +         -
6.      Low-Falling  21    -         +            -         -

Table 2/2: Chinese (Cantonese) Tone Levels
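
The feature specifications in the table can be expressed as a small lookup in Python. The mapping from tone digit to the pair ([upper], [raised]) used below is an assumption read off the table rows (5 as + +, 3 as + -, 2 as - +, 1 as - -), and the names `FEATURES` and `describe` are hypothetical; the sketch only illustrates the data structure and is not part of the cited proposals.

```python
# Assumed mapping from a tone digit to its ([upper], [raised]) values,
# inferred from the rows of the table above.
FEATURES = {
    "5": ("+", "+"),
    "3": ("+", "-"),
    "2": ("-", "+"),
    "1": ("-", "-"),
}


def describe(tone):
    """Return onset and offset features for a two-digit tone such as '25'."""
    onset, offset = tone[0], tone[-1]
    return {
        "onset": FEATURES[onset],
        "offset": FEATURES[offset],
        "contour": onset != offset,  # level tones have identical end points
    }


for tone in ["55", "33", "22", "25", "23", "21"]:
    print(tone, describe(tone))
```

A lookup of this kind makes it easy to see that the three level tones differ only in their feature pair, while the three contour tones share the onset of tone 2 and differ in their offset.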


2.2.4 Standard Notation in IPA 
The IPA consists of a universal set of symbols representing the distinctive sounds of the
world’s languages and is used to show pronunciation in many dictionaries
(International Phonetic Association 1999).
The IPA chart consists of several sections such as vowels, pulmonic consonants and
non-pulmonic consonants. The IPA chart can be a useful tool for teaching the basics
of speech production, as it shows at a glance commonalities and differences between
the articulations of various speech sounds.


Different notations were being followed in Asia, Africa, America etc. to denote
tone; thus the IPA was devised by Henry Sweet (1889) to standardize the notation of
the various diacritics applied over segmental representations. For most of the world’s
tonal languages, the 5 levels of pitch height provisioned in the IPA chart suffice
for transcription. However, there are a few exceptions such as African languages
(Chori, Benchnon etc.), Asian languages viz. Chao etc. IPA provides diacritics for
upstep (ꜛ) and downstep (ꜜ) to facilitate representation of the desired number of tonal
heights. The various tone contours are also provisioned as below:

Table 2/3: Register & Contour Tone Levels in IPA

These IPA notations will be used for data representation in this Thesis.

2.3 Function of Tones


Phonological theories are sharply divided into two areas: segmental and prosodic.
Segmental phonology focuses on “melody”: speech sounds (segments), their internal
composition and external interactions.

One of the greatest discoveries by Trubetzkoy (1939) & Jakobson (1941) in this area
is that segments consist of features and it is through these that segments interact with
each other. Segmental phonology is therefore concerned with phonological features and
how they are organized inside segments and between segments. Prosodic phonology
focuses on aspects of the sound system “above” the level of segments, such as timing,
tone, stress and rhythm. Research into the nature and patterning of these phenomena
suggests that speech sounds are not just arranged linearly, but are hierarchically
organized into prosodic structure: segments into moras and syllables, syllables into
metrical feet, metrical feet into prosodic words, prosodic words into phonological
phrases, and so on. The prosodic structure is as given below:

phonological phrase
    phonological words
        metrical feet
            syllables
                moras
                    segments

Fig 2/9: Segmental Phonology


A tone system has lexical, morphological and syntactic functions. Tone systems have
properties which surpass segmental and metrical systems. This is especially true of
the long-distance effects that tone exhibits both within and across words, as when the
tone of one word migrates several syllables or words to its right. Some tonal
phenomena have no segmental or stress analogues, thus there is a need to understand
how tone systems work.
Thus the role of segmental phonology is not limited to syllable structure and
the distribution of the consonant and vowel phonemes but also covers the tones and
tone sandhi, leading to a tone system, viz. a system of six to eight contrastive tones at
the lexical level. The functions of tone include restrictions on the lexical tone
system according to the part of speech, and tone sandhi, i.e. tone marks
grammatical contrasts in addition to lexical units and serves as a cue to syntactic and
discourse structure.


Based on this, the tonal languages of the globe can be divided into two categories:
Asian tone languages, in which tone is primarily limited to lexical function (Type A
languages), and African and Central American tone languages, in which the tone spreads
to neighboring syllables and which exhibit segmental morphology and have
polysyllabic roots (Type B languages). However, there are some languages like
Japanese which don’t fall under either of these categories, as every word in Japanese has
a fixed tone pattern.


2.3.1. Lexical Level Tone Function 


In Type A languages the role of tone is limited primarily to lexical function; tone does
not operate at the morphological level and thus does not mark morphological contrasts.
These languages have more phonemic tones, and their tone sandhi rules involve predictable
replacement of one tone by another rather than spreading of a tone onto neighboring
syllables. There is no use of segmental morphology; however, syntactically defined tone
sandhi compounds may be present. Minor (closed) word classes marked by tone may
differentiate lexical meaning. Predominantly monosyllabic roots are found in such languages.
Phonological word-building resources are determined by non-tonal contrasts; however,
a set of vocabulary is governed by tone function. Most Asian tone languages belong
to this category except for Hakha Chin or the Lai language, which has exceptional
features of Type B although it is spoken in some parts of Asia, viz. Mizoram in eastern
India and Burma, with a small number of speakers in Bangladesh. Tone is lexically
contrastive in Japanese. Punjabi can be clearly categorized as a Type A language.

2.3.2. Morphological Level Tone Function


These languages (Type B) exhibit all the types of tone function discussed for Type A
languages; in addition, they also make major use of tone in
morphological processes such as tonal derivation, inflection etc. Polysyllabic roots
are found and these languages exhibit derivational and inflectional segmental
morphology. Thus major (open) word classes are characterized by different tone
inventories or alternation of tone patterns. The number of possible syllables and of their
positions within the non-tonal words is comparatively high. Tone Sandhi is
syntagmatic.


Tone Sandhi is governed by a number of rules that modify tones when spoken in a
sequence, i.e. when spoken in normal phrases rather than in isolation. One of the most
well-known cases is Mandarin Chinese, wherein, when two Tone-3 syllables occur in
sequence, the first one is changed to Tone 2 as explained in the examples given
below:

mai  hau   chou      chi  shuei  guo      wo  hen   ho
3    3>2   3         1    3>2    3        3   3>2   3
buy  good  wine      eat  water  fruit    I   very  good

Each phrase consists of 3 syllables. They are spoken first as isolated syllables (without
sandhi) and then as a phrase (with sandhi). The tone of the middle syllable changes in
each case from Tone 3 to Tone 2 (indicated by “3>2”).
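
The Tone-3 sandhi rule illustrated above can be written as a short Python function. This is a deliberately simplified sketch: it scans the phrase from right to left and changes a Tone 3 to Tone 2 whenever the following syllable still carries Tone 3, which reproduces the three examples given here. Actual Mandarin sandhi also depends on the grouping of syllables within the phrase, which this sketch ignores; the function name is hypothetical.

```python
def apply_third_tone_sandhi(tones):
    """Apply a simplified Mandarin Tone-3 sandhi to a list of tone numbers."""
    surface = list(tones)
    # Scan right to left so that each decision sees the already adjusted
    # tone of the following syllable.
    for i in range(len(surface) - 2, -1, -1):
        if surface[i] == 3 and surface[i + 1] == 3:
            surface[i] = 2
    return surface


examples = {
    "mai hau chou": [3, 3, 3],
    "chi shuei guo": [1, 3, 3],
    "wo hen ho": [3, 3, 3],
}
for phrase, tones in examples.items():
    print(phrase, tones, "->", apply_third_tone_sandhi(tones))
```

In each of the three phrases only the middle syllable changes, matching the "3>2" marking in the examples.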


Most of the languages of the African and Central American regions belong to this
category. Word-building in Type B languages uses tonal morphology.

2.4 Study of Tone in Different Language Families


A language family is a group of languages that are related through descent from a
common ancestor. All natural languages of the world have a historical base. The
boundary of linguistic ancestry is not always clear, as languages come into contact
with each other due to conquest or trade or through other means and they tend to
borrow features from languages with which they do not have any historical
relationship.


The common ancestor of a language family can be identified by Comparative
Linguistics, which studies the historical and genetic relationships between languages.
The regularity of sound change is the pre-requisite for the comparative method. It
implies that when a certain sound X changes in one word, the same change tends to
take place in all words where sound X occurs, or in all words where sound X occurs
in a particular context, e.g. the Latin sound cluster /kt/ undergoes the changes shown
below:


• Latin /kt/ > Portuguese /jt/
• Latin /kt/ > Spanish /tʃ/
• Latin /kt/ > Italian /tt/
• Latin /kt/ > Romanian /pt/
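
The regularity principle can be illustrated by applying each of the correspondences listed above to every word that contains the cluster. The Python sketch below does only that: the three input forms are hypothetical, simplified transcriptions containing /kt/, all other sound changes are ignored, and the resulting strings are therefore not claimed to be the attested modern words. The Spanish reflex is written as /tʃ/.

```python
# Reflexes of the Latin cluster /kt/, as listed in the text above.
REFLEXES = {
    "Portuguese": "jt",
    "Spanish": "tʃ",
    "Italian": "tt",
    "Romanian": "pt",
}


def apply_sound_change(word, reflex, cluster="kt"):
    """Replace every occurrence of the cluster: a regular sound change."""
    return word.replace(cluster, reflex)


# Hypothetical simplified transcriptions that contain the cluster /kt/.
latin_forms = ["okto", "nokte", "lakte"]
for language, reflex in REFLEXES.items():
    print(language, [apply_sound_change(w, reflex) for w in latin_forms])
```

Because the same replacement applies to every word containing the cluster, the daughter forms correspond to each other systematically, and it is this systematic correspondence that the comparative method exploits.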


The branching structure of a family tree is based upon shared changes. These changes
distinguish the group from related languages. The suffix -ic is used to designate
language families and major groups, such as Turkic, whereas Turkish is a language.

Languages are often characterized as tonal or non-tonal. Tonal languages utilize pitch
to distinguish lexical items, whereas non-tonal languages do not use pitch
distinctively. Tonal languages are further divided into tone languages and
pitch-accent languages. In tone languages, the tone of each syllable is unpredictable and,
therefore, must be specified in the lexicon. No syllable in tone languages is
considered more prominent than any other.
In pitch-accent languages, by contrast, the specification of some accent location is
sufficient to predict the tonal configuration, or melody, of the entire word. Therefore,
the syllable on which such an accent falls is considered more prominent than other
syllables. It can also be said that moving from one tone to the next in tone languages
is a syllable-level phenomenon, whereas such a movement in pitch-accent languages
is a word-level phenomenon.

The Japanese and Korean languages aren’t specifically covered under any language
family and hence are discussed here.


In standard Japanese, the only distinctive melodic characteristic of a phrase is the
location of the syllable, if any, where the pitch drops. The tone pattern of a Japanese
word is predictable, as can be seen from the following example, where syllables are
separated by a hyphen, H is a high tone and L is a low tone:

ka-ki-ga  H-L-L  ‘oyster’

ka-ki-ga  L-H-L  ‘fence’

ka-ki-ga  L-H-H  ‘persimmon’


Thus, for a given word form, there are only as many possible tonal patterns as there
are syllables (ignoring unaccented words); a trisyllabic word has three possible
tone patterns. Accent, unlike stress, is not necessarily accompanied by greater
duration or amplitude. Apart from its effect on pitch, accent is hardly felt by native
Japanese speakers. Pitch can be predicted from accent marks as follows: the pitch is
high up to the first mora of the accented syllable (or up to the end of the phrase, if it
is unaccented) and low thereafter; if the first syllable is unaccented, its first mora is
low pitched.
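
The prediction rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: every syllable is treated as one mora, and the function name `pitch_pattern` and its parameters are hypothetical, not taken from any cited work.

```python
def pitch_pattern(n_morae, accent=0):
    """Return one 'H' or 'L' per mora.

    accent is the 1-based position of the accented mora; 0 means unaccented.
    Assumption: each syllable counts as exactly one mora.
    """
    if n_morae < 1 or not 0 <= accent <= n_morae:
        raise ValueError("accent must lie between 0 and n_morae")
    # High up to the accent, or up to the end of the phrase if unaccented.
    last_high = accent if accent else n_morae
    pattern = ["H" if i <= last_high else "L" for i in range(1, n_morae + 1)]
    # The first mora is low unless it carries the accent itself.
    if accent != 1 and n_morae > 1:
        pattern[0] = "L"
    return pattern


print("-".join(pitch_pattern(3, accent=1)))  # ka-ki-ga 'oyster'
print("-".join(pitch_pattern(3, accent=2)))  # ka-ki-ga 'fence'
print("-".join(pitch_pattern(3, accent=0)))  # ka-ki-ga 'persimmon'
```

With these assumptions the three calls reproduce the H-L-L, L-H-L and L-H-H patterns of the ka-ki-ga example.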


The Korean language made use of tones until the late 16th century. It had a system of
denoting the four tones by placing one or two dots to the left of the letter. Until
around the 20th century, it was common for Koreans to distinguish certain words by
pronouncing them a little longer. These vestiges of tone today go unnoticed even by
Koreans themselves.

Depending on the morphological category of the morpheme, its dictionary entry will
specify either the syllable, if any, on which it contributes an accent (nouns,
postpositions, verb inflections) or merely whether or not it contributes an accent
(verbs, adjectives). The rules apply in such a way as to yield outputs in which each
phrase has at most one accent. Some accent rules make one accent predominate over
others, whereas others attract accent into a given position.


2.4.1 Niger-Congo Languages 


It is the largest language family in the world, with 1436 languages, and mainly covers
the languages of Africa. Many of these languages have a phonological
contour tone, which is exhibited on long syllables. In some languages short syllables
carry only level tones, while others allow a contour on a short vowel. The main branches are:


[Figure: tree diagram. Niger-Congo branches into Western Sudanic and Benue-Congo; sub-branches shown: Mande, West Atlantic, Gur, Kwa, Bantu, Delta Cross; languages discussed: 1 Yoruba, 2 Ikhin (Edo), 3 Ibibio]


Fig 2/10: Niger-Congo Languages 


2.4.1.1 Yoruba Language (Register) has three phonemically distinctive tones-H, M, 
and L. H occurs in word-initial position only in marked consonant-initial words, 
which reveal an implicit initial vowel when preceded by another word in genitive 
construction. Most words start with a vowel, which is L or M but not H. Except for 
this minor tonotactic restriction, tones occur freely in lexical representations, without 
apparent restrictions on word melodies. 

There are thus three possible tonal patterns for monosyllables and nine for
disyllables. Lexical tone contrast in such words is illustrated in the
following examples:

S. No. | Word 1 (H) | Word 2 (M) | Word 3 (L)
1 | ra “to disappear” | ra “to rub” | ra “to buy”
2 | Pako “plank” | Kése “mythological place name” | Pako “chewing stick”
3 | Oko “hoe” | Oko “husband” | Oko “vehicle”
4 | ta “town” | Tu “opener” | Tu “drum”


Table 2/4: Tone levels in Yoruba 
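
The counts given for Yoruba (three patterns for monosyllables, nine for disyllables) follow from combining three tones freely over each syllable. A minimal Python sketch is given below; the function name `tonal_patterns` is hypothetical, and the tonotactic restriction on word-initial H mentioned above is deliberately ignored.

```python
from itertools import product


def tonal_patterns(tones, n_syllables):
    """List every sequence of tones over n_syllables, with no restrictions."""
    return ["".join(p) for p in product(tones, repeat=n_syllables)]


mono = tonal_patterns("HML", 1)
di = tonal_patterns("HML", 2)
print(len(mono), mono)
print(len(di), di)
```

In general a word of n syllables has 3 to the power n unrestricted patterns, which is why tone in such a language has to be listed in the lexicon.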


2.4.1.2 Ikhin (Edo) Language (Register) is spoken in Ikhin, Edo, Nigeria. Ikhin has a
terraced-level tone system with two basic tones, viz. high and low. The following
minimal pairs of words are differentiated only by tonal contrast:

S. No. | Word 1 (HL) | Word 2 (LL)
1 | Aki “Toad” | aki “Market”
2 | dkpa “Cock” | dkpa “One”
3 | éda “High” | eda “River”


Table 2/5: Tone levels in Ikhin 


2.4.1.3 Ibibio Language (Register) has three tones (high, low and falling). The
falling tone occurs only on final syllables, giving the following combinations in
two-syllable words:

Tone on second syllable | Tone on first syllable: H (Word 1) | Tone on first syllable: L (Word 2)
H | akpa “expanse of ocean” | akpa “first”
F | akpan “square woven basket” | akps “rubber tree”
L | aku “priest” | akpa “(small ant)”


Table 2/6: Tone levels in Ibibio


2.4.2 Austric Languages 


The Austric proto-language has been identified by some with the Hoabinhian
archaeological industry, dating from the late Pleistocene to the mid-Holocene (roughly
6,000 to 12,000 years ago). Primary Hoabinhian sites have been identified in
Sumatra, Thailand, Laos, Myanmar and Cambodia, while isolated inventories of
stone artefacts displaying Hoabinhian elements have been found in Nepal, South
China, Taiwan and Australia. Except for Nepal and Australia, all of these areas are
home to Austric languages, and there is evidence that Austric may formerly have been
spoken in the Himalayan foothills also.


[Figure: tree diagram. Austric branches into Austro-Tai (1 Thai, 2 Lao) and Austro-Asiatic (3 Vietnamese, Munda)]


Fig 2/11: Austric Languages 


2.4.2.1 Thai (Contour) Language is a tonal language. Tones are at the core of the
language: they are as essential as any vowel or consonant, and they
distinguish the meaning of one word from another.

Each syllable is pronounced with one of five distinct tones: middle, low, falling, high
or rising. The middle tone starts at a middle pitch level, rises slightly and returns to
mid-level. The low tone starts low and gradually falls even lower. The falling tone
starts high and falls to a low pitch. The high tone rises to a peak and then drops. The
rising tone starts at mid-level and gradually rises.
S. No. | Word 1 (M) | Word 2 (L) | Word 3 (F) | Word 4 (H) | Word 5 (R)
1 | mai “mile” | Mai “new” | mai “not” | mai “wood” | mai “no?”
2 | kha: “a glass” | kha: “galangal” | kha: “slave” | kha: “to engage in trade” | kha: “leg”


Table 2/7: Tone levels in Thai 


2.4.2.2 Lao (Contour) Language is an isolating tone language in which most syllables
form individual morphemes. There are only eight bound derivational morphemes
(Enfield 2007). Tone varies significantly depending on the Lao dialect; Lao linguists
have identified five tones on long vowels and three tones on short vowels.

S. No. | Word 1 (L) | Word 2 (M) | Word 3 (F) | Word 4 (H) | Word 5 (R)
1 | kha: “slave” | kha: “galangal” | kha: “commerce” | kha: “stuck” | kha: “leg”


Table 2/8: Tone levels in Lao 


2.4.2.3 Vietnamese (Contour) Language is the official language of Vietnam.
Vietnamese is based on melodious syllables and stressed accent. It is a monosyllabic
language with each articulated sound carrying a certain meaning. There are five types
of tones and a mid-level non-tone.

Word 1 (L) | Word 2 (HR) | Word 3 (M dipping R) | Word 4 (LF) | Word 5 (LF, short) | Word 6 (HR, glottal stop)
ma “ghost” | Ma “cheek” | ma “tomb” | ma “but” | ma “young rice” | ma “horse”


Table 2/9: Tone levels in Vietnamese 


Tones are realized by a complex of pitch and voice quality features. In particular,
glottalization plays an important role in the production and perception of the broken
and glottalized tones. The falling tones have been described by some researchers as
accompanied by a breathy voice quality. The low falling tone has also been described
as accompanied by light final laryngealization.

Although Vietnamese tones are not subject to phonological tone sandhi (i.e. the
realization of a tone is not affected by the surrounding tonal environment), tonal
realization in connected speech is subject to phonetic coarticulation effects.


2.4.3 Indo-European Languages 


It is one of the largest language families in the world, having ten branches of living
languages. Of these, three are primarily spoken in India, i.e. Armenian, Iranian and
Indo-Aryan (Indic). The most widely spoken Indo-European languages are Spanish,
English, Hindustani, Portuguese, Bengali, Russian and Punjabi (over 100 million
speakers each). The next most widely spoken languages are German, French and Persian.
Germanic languages possess a number of defining features compared with other
Indo-European languages.


[Figure: tree diagram of the Indo-European branches; legible labels include Norwegian and Iranian]

Fig 2/12: Indo-European Language Families

2.4.3.1 Swedish Language (Register) is a pitch-accent language with two
distinctive accents related to different syllabic structures. Acute and grave accents
often distinguish meaning. Monosyllabic words and words with the stress on the last
syllable receive the acute accent. It can occur in any accented syllable regardless of
position.

Acute Accent (accent 1):

1. Monosyllabic words, including their declined forms, e.g. /huset/

2. Words which start with an unstressed syllable, e.g. /batala/

Grave Accent (accent 2):

It never occurs in the last syllable of a word; it therefore occurs only in
polysyllabic, or at least disyllabic, words.

S. No. | Word 1 (Acute accent, H) | Word 2 (Grave accent, L)
1 | Slutet “the end” | slutet “closed (perf. part. of att sluta)”
2 | Vaken “the ice hole” | Vaken “awake”
3 | Skallen “the bark” | skallen “the skull”
4 | Egen “own” | égen “peculiar”


Table 2/10: Tone levels in Swedish 


2.4.3.2 Latvian Language (Register) is a Baltic language, hence it exhibits syllable
tones (also called syllable accents or syllable intonations). There are three types of
tones, viz. level (L), falling (F) and broken (B) tones, which are associated with a
syllable having a long vowel, a diphthong or a combination of a short vowel plus
sonorant (so-called diphthongal sequences) respectively.

S. No. | Word 1 (L) | Word 2 (F) | Word 3 (B)
1 | mit “change” | mit “exist” | mit “tread”
2 | atiksts “cold” | - | atiksts “high”
3 | - | rauks “pucker” | rauks “yeast”
4 | valks “tether” | - | valks “humid”


Table 2/11: Tone levels in Latvian 


For syllable tones, an obstruent occurring after a short vowel has no bearing on 
syllable structure and it could as well be absent from it, as syllables of this kind 
would have no distinctive tone in either case and are therefore called short. 


Type of vowel Word Type of vowel Word 


nnd ‘row, line’ lazda ‘hazel’ 


2.4.4 Sino-Tibetan Languages 


This family has around 300 members and five main branches, viz. Tibetic (Bodic),
Burmic, Bai, Karenic and Sinitic.


[Figure: tree diagram. Sino-Tibetan branches into Sinitic (1 Mandarin) and Tibeto-Karen; Tibeto-Karen branches into Karen and Tibeto-Burman (2 Mizo, 3 Manipuri, 4 Bodo)]

Fig 2/13: Sino-Tibetan Languages

2.4.4.1 Mandarin (Contour) is a contour tone language. To differentiate meaning,
the same syllable can be pronounced with different tones.
Mandarin's tones give it a very distinctive quality, but the tones can also be a source
of miscommunication if not given due attention. Mandarin is said to have four main
tones and one neutral tone (or, as some say, five tones).


• The first tone is flat, just like walking on a flat smooth road.
• The second tone rises, like going uphill.
• The third tone falls and rises, like riding on a roller coaster.
• The fourth tone goes down fast all the way.


The fifth, neutral tone is short and lightly spoken, and can be seen in the use of the
word “ma” at the end of a sentence to make it a question. Each tone has a distinctive
pitch contour which can be graphed using the Chinese 5-level system.
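
The 5-level system can be illustrated with a short Python sketch. The numeral values below (55, 35, 214, 51) are the ones commonly cited in textbooks for the four main Mandarin tones and are an assumption here, not values taken from this thesis; the names `CHAO` and `contour_shape` are hypothetical.

```python
# Tone numerals on a 1 (lowest) to 5 (highest) scale.
# Assumption: commonly cited textbook values for Standard Mandarin.
CHAO = {"tone 1": "55", "tone 2": "35", "tone 3": "214", "tone 4": "51"}


def contour_shape(numerals):
    """Describe a tone numeral string as level, rising, falling, etc."""
    levels = [int(ch) for ch in numerals]
    shape = []
    for previous, current in zip(levels, levels[1:]):
        if current == previous:
            continue
        move = "rising" if current > previous else "falling"
        # Collapse consecutive moves in the same direction.
        if not shape or shape[-1] != move:
            shape.append(move)
    return "-".join(shape) if shape else "level"


for name, numerals in CHAO.items():
    print(name, numerals, contour_shape(numerals))
```

Under these assumed values the four tones come out as level, rising, falling-rising and falling, matching the descriptions in the list above.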


S. No. | Word 1 (L) | Word 2 (R) | Word 3 (FR) | Word 4 (F) | Word 5 (N)
1 | 妈 ma “mother” | 麻 ma “numb” | 马 ma “horse” | 骂 ma “scold” | 吗 ma “question word”
2 | bi “force” | bi “nose” | bi “compare” | bi “wall” | bi


Table 2/12: Tone levels in Mandarin 


2.4.4.2 Mizo Language (Contour) is a Tibeto-Burman language spoken in India,
Bangladesh and Myanmar. Its tone system has been described and analyzed by native
speakers (Chhangte 1986; Fanai 1989, 1992) as having four tones. Chhangte
describes the Mizo tone inventory as including High, Rising, Falling and an
unmarked tone, where the unmarked tone is phonetically mid or low. Fanai also describes
the four tones of Mizo as High, Low, Rising and Falling, where the Low tone can also
have an allophonic variant realized as an extra low tone. The four tones in Mizo can
surface in monosyllabic, disyllabic and trisyllabic words.

S. No. | Word 1 (HR) | Word 2 (FR) | Word 3 (HF) | Word 4 (LR) | Word 5 (LF)
1 | Lum “bushy” | Lim “to cheer up” | Lum “to leave” | Lum “warm” | Lim “roll down”
2 | tsan “Soint” | tsan “To wait” | tsan “To warp” | tsan “Bird’s tail” | Tsan “alone”
3 | buk “hut/camp” | bik “To tip up” | buk “To weigh” | bak “Sound of sudden incident” | buk “bushy”
4 | bok “knob” | bok “Swaying to one side” | bok “Temporary village” | bok “also” | bok “To lie down”


Table 2/13: Tone levels in Mizo 


2.4.4.3 Manipuri Language (Register) is a tonal language with lexically
significant, contrastive but relative pitch on each syllable. There are three types of
tone, viz. falling, rising and level (Inder Singh 1975; Chetan Singh 1976).
Spectrographic analysis of Manipuri words reveals that phonemically only two tones
are realised, because the level tone occurring in certain words in isolation is replaced
by a rise-fall when preceded by roots containing the final tone.

S. No. | Word 1 (L) | Word 2 (Level)
1 | kanba “Hard” | kanbo “To protect, etc.”
2 | taba “to hear” | tabo “to fall, etc.”
3 | mi “man” | mi “Shadow”
4 | Khon “Les” | Khon “Canal”


Table 2/14: Tone levels in Manipuri 


Level tone occurs in monosyllabic as well as polysyllabic words. It has two allotones,
viz. level, unmarked in transcription, and rise-fall, marked as /’/.

S. No. | Word 1 (Low Low) | Word 2 (Rising falling)
1 | /ma + pu/ “His/her + to bring” | [moapu] “His/her mode of bringing”
2 | /mi + sin/ “Man + marker of plurality” | [misin] “Men”
3 | /so + mu/ “Animal + black” | [somu] “Elephant”


Table 2/15: Allophonic variations of level tone in Manipuri 


The four possible tonal sequences in Manipuri, as discussed in chapter II of
Shodhganga, are:

Tonal sequence | Word | Meaning
Level + Level | /kabok/ | parched rice
Level + Fall | /kaphoy/ | pomegranate
Fall + Fall | /khaba/ | bitter
Fall + Level | /thamoy/, [thamdy] | heart


Table 2/16: Tonal sequences in Manipuri 


2.4.4.4 Bodo Language (Register) is a tonal Tibeto-Burman language. It
is spoken mainly in the northern parts of the State of Assam in India. Garo, Boro,
Rabha, Tiwa and Kokborok all belong to the Bodo subgroup. Boro is the major
dialect. Linguistic work on Bodo is relatively new, hence there is a dearth of
proper research on its tonal phenomena; the available research is
summed up below.

Bhattacharya (1977) described a maximum of four tones in the Bodo language, i.e.
neutral, high, mid and low. The neutral tone is dependent on the associated tone
(L/M/H) and on the quality of the vowel, whether centralized or more lax. In the high
tone, the pitch contour is level or rising and the quality of the vowel is closer and
tense. In the mid tone, the pitch contour is level or falling and the quality of the
vowel is medium as to closeness and tenseness. In the low tone, the pitch
contour is falling and the quality of the vowel is open and lax.


S. Word 1 Word 2 Word 3 
No. L M H 
1 aid - ail / ait 
income goddess / mother 
2 Dan Dan Dani 
month to cut gift 
3 - eo1/eo4 eot 
to plough / to fry to clear by cutting 
4 k"ad k"al /k*ad - 
to test bitter to pluck / to tie 
3 Ond Oni / Oni - 
to open to love / powder of rice 


Table 2/17: Four way tone levels in Bodo 


Weidert (1987) also identifies the presence of tone in Bodo and opines that the
tone patterns in Bodo are dependent on the syllable types and the consonantal
specification of the syllable coda. Boro (1991) identifies a two-way tone system
in Bodo, which he describes as the rising and the falling tones.

S. No. | Word 1 (L) | Word 2 (H)
1 | doi “water” | dai- “lay egg”
2 | tai- “die” | tai “blood”
3 | Hor “night” | Or “fire”
4 | ka- “tie” | ka- “bitter”
5 | seo- “rot” | sao- “burn”


Table 2/18: Two way tone levels in Bodo

Sarmah (2004) examined the autosegmental nature of tones using Optimality Theory.
He proposed two constraints: ALIGN-L (DT, PRWD), which says that each default tone
should align with the left edge of the domain, and ALIGN-R (PRWD, LT), which says
that the left edge of the domain should be specified with a lexical tone. However, this
needs further investigation.


2.5 Indic Languages 


The Indo-Aryan or Indic languages are the dominant language family of the Indian
subcontinent. They constitute a branch of the Indo-Iranian languages, itself a branch
of the Indo-European language family. Indo-Aryan speakers form about one-half of
all Indo-European speakers (about 1.5 of 3 billion), and Indo-Aryan languages make up
more than half of all Indo-European languages recognized by Ethnologue. While the
languages are primarily spoken in South Asia, pockets of Indo-Aryan speakers are
found in Europe and the Middle East. The largest in terms of native speakers
are Hindustani (Hindi-Urdu, about 329 million), Bengali (242
million), Punjabi (about 100 million), Dogri (4 million) and other languages, with a
2005 estimate placing the total number of native speakers at nearly 900 million.


[Figure: tree diagram. Indic: Vedic Sanskrit, then Prakrit (Pali), then Apabhramshas, branching into Hindi, Kashmiri, Bengali, Gujarati, Marathi, Punjabi, Dogri, Sindhi, Maithili and Oriya]

Fig 2/14: Family of Indic languages

The Punjabi dialect continuum has clearly been determined to possess tonal features,
although it has no genetic connection with other tonal languages, including those that
are geographically proximate, such as Tibetan and Chinese. Dogri, however, is
another tonal language in this family. Ghai (1991) studied the phonetics and phonology
of Dogri monosyllabic words and a few disyllabic words and stated that vowel
quantity plays a role in the configuration of rules for stress in Dogri. Stress is
phonetically realized by duration and pitch movement. She further states that it is the
stress feature that determines the place of the word tone. Tones in Dogri are due to
tonemes, and only a single tone occurs on a simple word. She reported three tones in
Dogri, namely mid-level, falling and rising. Kaul (2017) experimentally observed
tone in Dogri words and verified that the vowel bearing the falling-rising
tone is the longest in duration.


S. No. | IPA | Meaning | Tone | Average vowel duration of the nucleus
1 | /ca/ | Peep | Falling-rising | 0.42
2 | /ca/ | Tea | Mid | 0.33
3 | /ca/ | Desire | Low | 0.16


Table 2/19: Tone levels in Dogri 


2.6 Summary 


Tonal languages from various language families have been surveyed in detail to
understand the types of tone and the tone variations within and across
languages. Among Indic languages, the presence of tone has been discussed only for
Punjabi and Dogri. Tones in the Punjabi language and their experimental verification
will be discussed in detail in the next chapter.

Chapter 3

Tonogenesis of Punjabi: Experimental Observations

3. Introduction


The Punjabi lexicon has close ties with early Vedic Sanskrit and has also assimilated a
wide array of words and expressions from Arabic and Farsi. The presence of pitch
contours in Punjabi has been discussed by various linguists.


Linguist | A | B | C | D
Bailey, T. G. (1914) | Low rising | Ordinary | High falling |
Behal, K. C. (1957) | Falling | Even | Rising |
Sampat, K. S. (1964) | Falling | Level | Rising |
Gill, H. S. & Gleason, H. A. (1969) | Low | Mid | High |
Joshi, S. S. (1973) | Tone 1 | Tone 2 | Tone 3 |
Sandhu, B. S. (1974) | High | Level | Low |
Malik, A. N. (1994) | High-falling | Level | Low-rising | Rising falling

Table 3/1: Eminent linguists’ description of Punjabi tones 


Joshi (1987) established through research studies that the vibration of the vocal
cords results in a change of pitch and that this change of pitch is used to distinguish
words. Tone is observed on only one syllable and may co-occur with stress on it. If a
class I vowel occurs in the first syllable, the tone is extended to the second syllable.
Although a word has only one tone, phonetically its effect is observed across
syllables.


Singh (2001): Punjabi has a lexically significant contrastive pitch accent (tone), which
it uses to distinguish words that otherwise have an identical phonetic form.
The use of pitch by Punjabi to differentiate the meaning of various lexical items, i.e.
words, establishes it as a tone language beyond any doubt. The author studied the
prosodic features in Paninian linguistics and evolved the Moraic Model for
representing them. In particular, he carried out a study of tones in Punjabi, in which
he identified the presence of three tones.


Sangha (2014): the low tone is characterized by lowering the voice below the normal
pitch and then rising back in the following syllable. In the high tone the pitch of the
voice rises above its normal level, falling back at the following syllable. The level
tone is carried by the remaining words. Thus there is a need to examine the lexical
tone in this context in Punjabi. Low tone and high tone can occur in monosyllabic,
disyllabic and trisyllabic environments. The following examples illustrate that tone
plays a significant role in the Punjabi lexicon, as is evident from the minimal pairs
given in the table:
S. No. | [ਕ] /k/ | [ਪ] /p/
1 | ਘੋੜਾ /kora/ ‘Horse’ | ਭਾੜਾ /para/ ‘Fare’
2 | ਕੋੜਾ /kora/ ‘Whip’ | ਪਾੜਾ /para/ ‘Difference’
3 | ਕੋੜ੍ਹਾ /kora/ ‘Leper’ | ਪਾੜ੍ਹਾ /para/ ‘Student’


Table 3/2: Tonal Minimal Pairs in Punjabi 


Thus Punjabi has three tones, viz. low tone /ə̀/, high tone /ə́/ and mid tone /ə/. Any
vowel can be a tone carrier; schwa is used here as an example.

The mid tone is never represented since it is predictable by rules of redundancy: if a
vowel does not have any tone specification at the level of phonetic representation, it
carries a mid tone by default. Tone placement also interacts with accent/stress. The
low tone must be on the same syllable as the accent (Bhatia 1993). Generally, not
more than one tone can occur in a single Punjabi word.


Tones in Punjabi can be broadly discussed under two categories: 


Tone Arising from Supra-Laryngeal Consonants 


Punjabi has five voiced aspirated consonants, also known as murmured consonants,
which are represented orthographically as ਘ /gʱ/, ਝ /dʒʱ/, ਢ /ɖʱ/, ਧ /dʱ/ and ਭ /bʱ/.
These have disappeared and resulted in tone: the tone is a remnant of
historically voiced aspirated consonants. If the murmured consonant was at the
beginning of a word, it left behind a low tone; at the end, it left behind a high tone. If
there was no such consonant, the pitch was unaffected; however, the unaffected
words are limited in pitch and did not interfere with the low and high tones. That
produced a tone of its own, the mid tone. The historical connection is so regular that
Punjabi is still written as if it had murmured consonants, and tone is not marked. The
written consonants tell the reader which tone to use. A phoneme that is distinguished
from another phoneme only by its tone is called a Toneme.


The tones in Punjabi arise as a reinterpretation of different consonant series in terms
of pitch, viz. four stops, ਘ /gʱ/, ਢ /ɖʱ/, ਧ /dʱ/ and ਭ /bʱ/, and one affricate, ਝ /dʒʱ/;
these five consonants are called Tonemes. The rules for characterization of Tonemes
are described in the table below, taking Toneme ਘ /gʱ/ as an example, as these are well
documented in the linguistic studies.

Toneme | Word / Meaning | Position of Toneme | Nature of Tone | Toneme Substitution | IPA Transcription
ਘ | ਘਰ / Home | Initial | Low /ə̀/ | [k] (voiceless unaspirate) | /kər/
ਘ | HUH / to Burn | Medial | Low /à/ | [g] (voiced unaspirate) | /magaia/
ਘ | ਮਾਘ / Name of the month | Final | High /á/ | [g] (voiced unaspirate) | /mag/


Table 3/3: Tone Marking Rules (Tonemes) 
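
The rules of Table 3/3 can be expressed as a small lookup, sketched below in Python. The table documents the substitution only for /gʱ/; extending the same voiceless/voiced pattern to the other four tonemes is an assumption made here by analogy, and the names `PLAIN` and `toneme_rule` are hypothetical.

```python
# (voiceless unaspirate, voiced unaspirate) counterpart of each toneme.
# Only the /gʱ/ row is taken from Table 3/3; the other rows are assumed by analogy.
PLAIN = {
    "gʱ": ("k", "g"),
    "dʒʱ": ("tʃ", "dʒ"),
    "ɖʱ": ("ʈ", "ɖ"),
    "dʱ": ("t", "d"),
    "bʱ": ("p", "b"),
}


def toneme_rule(toneme, position):
    """Return (tone, substituted consonant) for a toneme in a given position."""
    voiceless, voiced = PLAIN[toneme]
    if position == "initial":
        return "low", voiceless
    if position == "medial":
        return "low", voiced
    if position == "final":
        return "high", voiced
    raise ValueError("position must be 'initial', 'medial' or 'final'")


for pos in ("initial", "medial", "final"):
    print(pos, toneme_rule("gʱ", pos))
```

A pronunciation-lexicon generator could apply such a lookup to derive the tone mark and the surface consonant from the written form, subject to the exceptions that the experimental data may reveal.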


Independent Tone 


Sandhu (1968) discussed that the aspiration effect of ਹ /h/ in Pali, Prakrit and
Apabhramsha developed into the tone system of Punjabi during the Middle Indo-Aryan
period. Bailey (1914) stated that the tone resulting from medial ਹ /h/ occurs on
the last syllable and in some cases on the previous syllable. Tisdall (1953)
identified that in ਕਿਹਾ /keha/ and ਰਿਹਾ /reha/ the pronunciation of the consonant ਹ /h/ is
very weak and it does not act like an independent character.

Singh (1991): the consonant ਹ /h/ is used in all word positions, i.e. in the initial,
medial or final syllable. If ਹ /h/ occurs at the end of a word, it is not pronounced and
the word ends with breathy force, which shows the occurrence of tone, e.g. ਪੀਹ /pi/
‘grind’; ਚਾਹ /tʃa/ ‘tea’. Similarly, ਹ /h/ occurring in the medial position also acts as
a tone, e.g. ਸਹਿਜ /sédʒ/ ‘slowness’; ਇਹਨਾਂ /éna/ ‘these’.

Sangha (1999): /h/ in initial position is pronounced as a consonant and is non-tonal,
e.g. ਹਾਣੀ /hani/ ‘companion’; ਹੌਲੀ /holi/ ‘slow’. The consonant ਹ /h/ at the end of
a word is realised as a high tone. The tone due to ਹ /h/ in the medial position can
be high or low depending on the context.


Thus the tone rules are summarised below:

Consonant | Position of consonant /h/ in a word | Word / Meaning | Nature of tone | IPA transcription
ਹ | Final | ਚਾਹ / Wish | High /á/ | /tʃá/
ਹ | Medial | ਇਹਨਾਂ / These | High /é/ | /éna/


Table 3/4: Tone Marking Rules (Consonant /h/) 
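
The positional behaviour of /h/ summarised above (non-tonal when initial, high tone when final, high or low when medial) can be sketched as follows. The sketch assumes a simplified romanisation in which the letter `h` stands only for ਹ and each word is already split into syllables; the function names and this input format are hypothetical.

```python
from typing import List, Optional, Tuple

# Possible tones contributed by /h/ in each position, as summarised in the text.
H_TONES = {"initial": (), "medial": ("high", "low"), "final": ("high",)}


def h_position(syllables: List[str]) -> Optional[str]:
    """Locate /h/ in a syllabified word; return None when the word has no /h/."""
    if syllables[0].startswith("h"):
        return "initial"
    if syllables[-1].endswith("h"):
        return "final"
    if any("h" in syllable for syllable in syllables):
        return "medial"
    return None


def possible_tones(syllables: List[str]) -> Tuple[str, ...]:
    """Tones that /h/ may give rise to in this word (empty if none)."""
    position = h_position(syllables)
    return H_TONES[position] if position else ()


print(possible_tones(["ha", "ni"]))   # initial /h/: consonantal, no tone
print(possible_tones(["cah"]))        # final /h/
print(possible_tones(["ih", "na"]))   # medial /h/
```

Which of the two medial outcomes applies in a given word depends on context and is exactly what the experimental analysis is meant to establish, so the sketch returns both.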


Conjuncts of /h/ consist of pairin /ɦ/, e.g. ਪੜ੍ਹ /pad/ ‘to study’. It does not occur in
the initial syllable. The pronunciation of pairin /ɦ/ in the medial and final syllable is
so weak that it is perceived as a tone, as illustrated in the examples below:

Consonant | Position of /h/ in a word | Word / Meaning | Nature of tone | IPA transcription
Conjuncts of /ɦ/ | Medial | ਖਰ੍ਹਵਾ / Rough | High /ə́/ | /kʰərava/
Conjuncts of /ɦ/ | Medial | ਸਲ੍ਹਾਬਾ / Seepage | Low /à/ | /səlaba/


Table 3/5: Tone Marking Rules (Conjuncts of /ɦ/)

3.1 Methodology

Tone is observed through the change in pitch, viz. the fundamental frequency F0, as
discussed in section 2.1. The methodology followed for the experimental study of tones
is as below:


3.1.1 Criterion for Data Collection: The frequency analysis of a corpus of 1 lakh
sentences reveals that the frequency of:

a) words containing one or more tonemes is about 10-15%
b) words containing consonant /h/ is 15-20%
c) words containing conjuncts of /ɦ/ is 1-5%


Thus the data needs to be designed specifically for experimental work on tonal
analysis, as discussed in section 1.8.1. Word selection criteria vary as follows:

A) For Tone arising from Supra-Laryngeal Consonants, words with each of the five
tonemes in the initial, medial and final syllable of the word will be compiled, ensuring
phonetic coverage in terms of various vowels, diphthongs, nasalization,
gemination and other co-articulation parameters, such as the occurrence of the Toneme
as onset/coda in the above contexts, across monosyllabic, disyllabic, trisyllabic and
polysyllabic words.

B) For Independent Tones, words containing consonant /h/ in the initial, medial and
final syllable of the word will be compiled to examine the tonal characteristics.
Conjuncts of /h/ do not occur in the initial syllable; hence words containing
conjuncts of /ɦ/ in the medial and final syllable will be compiled.

Data recording specifications were followed as elaborated in section 1.8.2.
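
The kind of corpus frequency analysis mentioned in section 3.1.1 can be sketched in Python as below. This is not the procedure used for the reported figures; it is a minimal illustration that assumes whitespace-separated Gurmukhi text, treats the sequence virama + ਹ as the conjunct (pairin) form, and uses hypothetical names throughout.

```python
from collections import Counter

# Gurmukhi letters for the five tonemes: gha, jha, ddha, dha, bha.
TONEME_LETTERS = {"\u0A18", "\u0A1D", "\u0A22", "\u0A27", "\u0A2D"}
HAHA = "\u0A39"                # full letter ha
PAIRIN_HAHA = "\u0A4D\u0A39"   # virama + ha, assumed encoding of the conjunct


def classify(word):
    """Return the set of tonal categories a word falls into."""
    categories = set()
    if any(ch in TONEME_LETTERS for ch in word):
        categories.add("toneme")
    if PAIRIN_HAHA in word:
        categories.add("conjunct_h")
    elif HAHA in word:
        categories.add("h")
    return categories


def category_shares(sentences):
    """Share of words per category over whitespace-separated sentences."""
    words = [w for sentence in sentences for w in sentence.split()]
    if not words:
        raise ValueError("corpus contains no words")
    counts = Counter(c for w in words for c in classify(w))
    return {c: counts[c] / len(words) for c in ("toneme", "h", "conjunct_h")}


# Four sample words: one with a toneme, one with ha, one with the conjunct, one plain.
sample = ["\u0A18\u0A30 \u0A1A\u0A3E\u0A39", "\u0A2A\u0A5C\u0A4D\u0A39 \u0A15\u0A30"]
print(category_shares(sample))
```

A real analysis would also need to strip punctuation and decide how to count words that contain both a full ਹ and a conjunct; the sketch counts such words only as conjunct words.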


3.1.2 Data Annotation using Praat Tool: The procedure to annotate the data in this
tool is listed below:

• Load a recorded wave file (.wav extension) by selecting “Read from file”. The
file will appear in the objects list.

• Click on “Annotate” and select “To TextGrid”.

• The created TextGrid will appear in the object list of the object window. Select
both the audio file and the TextGrid file and click on “Edit”.

• The speech waveform gives information about the duration (horizontal axis) and
loudness (vertical axis) of each part of the recording. In the spectrogram one can
see the energy (shade of grey or black) at each point in time (horizontal axis) and
each frequency (vertical axis).


Speech Wave Form 




Fig 3/1: Waveform & Spectrogram 


• Formants (in red), the intensity curve (in yellow), the pitch curve (in blue) and
the spectrogram (the grey part) can be displayed or turned off by clicking on the
corresponding buttons on the top bar of the window.

• To set a boundary (i.e. to mark the beginning or end of a phoneme, syllable
etc.), click on the appropriate place in the spectrogram. A blue circle appears on
the tier. The boundary is created by clicking on the circle.


[Figure: Praat TextGrid editor window; menu bar: File, Edit, Query, View, Select, Interval, Boundary, Tier, Spectrum, Pitch, Intensity, Formant, Pulses]

Fig 3/2: Boundary Tier for Phoneme Marking (bottom layer)

• After a second boundary has been created, the IPA transcription can be added for the
given phoneme. Clicking on the grey button underneath plays back this
particular part of the recording. This button also gives the exact duration, pitch
and intensity of the respective phoneme/syllable.




Fig 3/3: IPA Transcription Marking of Phonemes 


• Click the save button to save the file with the .Collection extension in the given path.


The first layer of the Praat annotation tool was used for phoneme-level annotation.
Syllable marking can be done by adding a second layer following the same procedure.
Some samples were not annotated due to improper recording (IP), i.e. the presence of
noise or some other factor that impacted the recording, such as incorrect pronunciation,
including non-tonal (NT) pronunciation due to error by the informant. These samples
are limited to 10% of the data and will be ignored in the presentation of the data.


3.1.3 Acoustic Parameters: The spectrographic analysis of all the samples will be
carried out using the Praat tool. After identification of the tone-bearing vowel
(tone-bearing unit, TBU) in the word being analysed, the F0 contour and the slope of
the contour over the pitch area of the TBU will be examined by recording parameters
such as F0, the slope of the pitch over the TBU and the quarter-wise slope data of the
pitch curve. Data sheets will be recorded for each word. The fundamental frequency is
speaker dependent; hence F0 can also be analysed for speaker variations. The
quarter-wise slope data will be correlated to detect the contour of the tone over the
duration of the TBU.
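
One way to compute and interpret quarter-wise slope data is sketched below in Python. This is an illustrative assumption, not the procedure used for the data sheets: the slope of each quarter is taken between its first and last F0 sample, the threshold of 20 Hz/s for treating a quarter as level is arbitrary, and the labels LH, HL, LHL and HLH follow the tone notation used in this chapter.

```python
def quarter_slopes(times, f0):
    """Slope in Hz/s over each quarter of the TBU, end point to end point."""
    n = len(f0)
    if n != len(times) or n < 5:
        raise ValueError("need at least 5 paired time/F0 samples")
    # Sample indices that split the contour into four stretches.
    bounds = [round(i * (n - 1) / 4) for i in range(5)]
    return [
        (f0[b] - f0[a]) / (times[b] - times[a])
        for a, b in zip(bounds, bounds[1:])
    ]


def contour_label(slopes, threshold=20.0):
    """Map quarter-wise slopes to a contour label (threshold is an assumption)."""
    moves = []
    for slope in slopes:
        if slope > threshold:
            move = "R"
        elif slope < -threshold:
            move = "F"
        else:
            continue  # treat small slopes as level
        if not moves or moves[-1] != move:
            moves.append(move)
    labels = {(): "level", ("R",): "LH", ("F",): "HL",
              ("R", "F"): "LHL", ("F", "R"): "HLH"}
    return labels.get(tuple(moves), "complex")


times = [i * 0.02 for i in range(9)]                      # seconds
rise_fall = [180, 190, 200, 210, 220, 210, 200, 190, 180]  # Hz
print(quarter_slopes(times, rise_fall))
print(contour_label(quarter_slopes(times, rise_fall)))
```

In practice the time and F0 values would come from the pitch listing of the annotated TBU interval, and the threshold would need to be tuned per speaker because pitch range is speaker dependent.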

The Praat graphs will be reported in the thesis. Some samples of Independent tone,
as discussed by Lata et al. (2013), were verified using the MATLAB tool; the code
used for plotting the graphs is given at the end of appendix C.

The higher the slope of the pitch variation across the TBU, the stronger the tone
pronounced by the speaker, which is generally the case with native speakers.
Non-native speakers or urban speakers sometimes pronounce tonal words with a weak tone.
The onset of tone and the realization of allotones will also be examined. The effect of
tone on other syllables within a word also needs to be studied. Any single tone in a
tonal language is susceptible to a good deal of variation owing to contextual
compulsions. These patterns may also vary across monosyllabic, disyllabic,
trisyllabic and polysyllabic words. Variations within a word may occur due to
co-articulation and other factors, as discussed below:

• The distance in tongue movement / movement of lips between consecutive
phonemes/syllables, depending on the place and manner of articulation.

• The sonority of the vowel bearing the tone.

• Variations across Tonemes and variations in Independent tone across words
containing consonant /h/ and conjuncts of /ɦ/.

• The variations due to the presence of gemination and diphthongs.

• Speaker variations such as speaker dependency (stylistic variations,
geographic variations etc.) while recording, age variations of speakers, and the
trend of loss of tone among urban speakers and non-native speakers.

• Speakers may differ both in pitch height and in pitch range; hence the articulation
of tone may vary from speaker to speaker.


3.1.4 Notations to Represent Tone: The symbol [o] has been used in the following table
to denote a tone-bearing vowel. The following IPA symbols will be used in the thesis for
marking tone in the representation of Punjabi PLS data.


S. No. Types of tone IPA Notation 
1 High tone / Rising f°) LH
2 Low tone / Falling ) HL 
3 Rising Falling ) LHL 
4 Falling Rising fe) HLH 


Table 3/6: Tone Marking Symbols 
3.2 Experimental Analysis of Tone arising from Supra-Laryngeal Consonants

Keeping in view the frequency of occurrence of words as discussed in section 3.1.1,
the data samples were drawn for analysis as tabulated below:


S. No. | Toneme | Mono-syllabic | Disyllabic        | Trisyllabic                | Polysyllabic     | Total
       |        |               | Initial | Final   | Initial | Medial | Final   | Initial | Medial |
1      | ਘ      | 8             | 10      | 9       | 1       | 7      | -       | 1       | -      | 36
2      | ਝ      | 9             | 8       | 8       | 3       | 5      | -       | -       | 1      | 34
3      | ਢ      | 8             | 4       | 7       | 4       | 6      | 4       | -       | -      | 33
4      | ਧ      | 4             | 8       | 11      | 1       | 5      | 3       | -       | 1      | 33
5      | ਭ      | 3             | 6       | 7       | 1       | 3      | 1       | 1       | -      | 22
Sub-total       | 32            | 36      | 42      | 10      | 26     | 8       | 2       | 2      |
Total           | 32            | 78                | 44                         | 4                | 158


Table 3/7: Size of Data Samples of Tonemes 
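
The row and column totals of Table 3/7 can be cross-checked with a short Python sketch. The counts below are transcribed from the table, with "-" read as zero; the variable names are illustrative only.

```python
# Columns: mono, di-initial, di-final, tri-initial, tri-medial, tri-final,
# poly-initial, poly-medial (one row per toneme, in table order).
rows = {
    1: [8, 10, 9, 1, 7, 0, 1, 0],
    2: [9, 8, 8, 3, 5, 0, 0, 1],
    3: [8, 4, 7, 4, 6, 4, 0, 0],
    4: [4, 8, 11, 1, 5, 3, 0, 1],
    5: [3, 6, 7, 1, 3, 1, 1, 0],
}

# Total per toneme (last column of the table).
row_totals = {k: sum(v) for k, v in rows.items()}

# Sub-total per column (the "Sub-total" row).
column_totals = [sum(col) for col in zip(*rows.values())]

# Totals per word category (the "Total" row).
group_totals = {
    "mono": column_totals[0],
    "di": sum(column_totals[1:3]),
    "tri": sum(column_totals[3:6]),
    "poly": sum(column_totals[6:8]),
}

grand_total = sum(row_totals.values())
```

The per-category totals computed this way add up to the same grand total as the per-toneme totals, which is the consistency check intended here.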


The word list of tonemes is given in Appendix A. 


Data Collation and Presentation 


The spectrographic analysis of all the male & female samples was carried out using
PRAAT. The duration, fundamental frequency (F0) and quarter-wise slope of the
vowel associated with the tone (TBU) have been recorded. The observations on the
contour of the tone over the TBU have been tabulated. The data have been tabulated
for various categories of words across the male and female speakers, capturing
the variety of acoustic environments as per Table 3/7, for studying the nature of the
associated tone.
Recording of Data Sheets

The phoneme-level annotated data of the above samples was used for recording various
acoustic parameters. A sample data sheet is given below:


Sample Data Sheet: 33 /bbd30/ 


Male Speakers | F0 in (Hz/Sec) | Slope in (Hz/Sec) | Cross-Sectional slope of TBU (Hz/Sec) | Contour of Tone | Duration of TBU
[data rows largely illegible; surviving values: 218, 204, | 173 | 182 | 193 | 193 |]

Table 3/8: Data Sheet of Male Speakers

Female Speakers | F0 in (Hz/Sec) | Slope in (Hz/Sec) | Cross-Sectional slope of TBU (Hz/Sec) | Contour of Tone | Duration of TBU
[data rows largely illegible; surviving values: | 319 | 332 |, | 719 | 296 |]

Table 3/9: Data Sheet of Female Speakers
The sample data sheets for each category of Tonemes are given in Appendix B. 


The rules reported in the literature review in section 3 will be corroborated and the
variations discovered will be discussed in detail.
3.2.1 Monosyllabic Words


These words have been analysed under two categories, depending on whether the toneme
appears as coda or onset.


Words with Toneme as Coda: 


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
yw 
1 fSu /dig/ | 0.24 | 0.29 | 0.27 | 213 | 315 | 264 | 329 | 315 | 322 LH 
2 sry /tag/ | 0.33 | 0.39 | 0.36 | 209 | 296 | 253 | 308 | 276 | 292 LH 
3 uby /pig/ | 0.31 | 0.39 | 0.35 | 224 | 318 | 271 | 246 | 216 | 231 LH 
4 Gu /ig/ | 0.33 | 0.32 | 0.33 | 240 | 313 | 277 | 326 | 397 | 362 LH 
5 Hy /mag/ | 0.29 | 0.29 | 0.29 | 202 | 268 | 235 | 244 | 238 | 241 LH 
if 
1 WE 0.32 | 0.42 | 0.37 | 209 | 303 | 256 | 278 | 560 | 419 LH 
/sad3/ 
2 Sz /béd3/ | 0.28 | 0.31 | 0.30 | 205 | 294 | 250 | 282 | 364 | 323 LH 
3 Gy /Sd3/ | 0.19 | 0.26 | 0.23 | 211 | 308 | 260 | 429 | 432 | 431 LH 
4 ay 0.30 | 0.38 | 0.34 | 206 | 278 | 242 | 290 | 351 | 321 LH 
/ba d3/ 
ue 
1 He /séd/ | 0.21 | 0.26 | 0.24 | 220 | 309 | 265 | 317 | 307 | 312 LH 
2 Sz /vdq]/ | 0.13 | 0.15 | 0.14 | 197 | 286 | 242 | 452 | 497 | 475 LH 
TU 
1 Wy /jsdd/ | 0.12 | 0.14 | 0.13 | 213 | 308 | 261 | 432 | 522 | 477 LH 
2 au /kd/ | 0.21 | 0.26 | 0.24 | 215 | 304 | 260 | 334 | 339 | 337 LH 
3 
1 Hts /dzib/ | 0.27 | 0.29 | 0.28 | 223 | 310 | 267 | 400 | 360 | 380 LH 


Table 3/10: Contour of Tone in Monosyllabic Words with Toneme as Coda 
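
The "Avg" columns in the tables of this section appear to be the simple mean of the male and female averages, rounded half-up. The Python sketch below reproduces two rows of Table 3/10 under that assumption; the function name `pooled_average` is illustrative, and the rounding rule is inferred from the tabulated values rather than stated in the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def pooled_average(male_avg: str, female_avg: str, places: str) -> Decimal:
    """Mean of the male and female averages, rounded half-up.

    `places` is a quantization pattern: "0.01" for duration, "1" for F0 and slope.
    Values are passed as strings so that Decimal sees them exactly as tabulated.
    """
    mean = (Decimal(male_avg) + Decimal(female_avg)) / 2
    return mean.quantize(Decimal(places), rounding=ROUND_HALF_UP)


# Row 1 of Table 3/10: duration, F0 and slope.
row1 = (
    pooled_average("0.24", "0.29", "0.01"),
    pooled_average("213", "315", "1"),
    pooled_average("329", "315", "1"),
)

# Row 2 of Table 3/10: the F0 mean of 252.5 rounds up under half-up rounding.
row2 = (
    pooled_average("0.33", "0.39", "0.01"),
    pooled_average("209", "296", "1"),
    pooled_average("308", "276", "1"),
)
```

Decimal arithmetic is used here instead of binary floating point so that values ending in exactly .5 are rounded the same way as in the tables.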
The rising tone is observed in all words. Sample word 3°Y /tag/ Anxiety:


Fig 3/4: Male Sample - LH    Fig 3/5: Female Sample - LH


Words with Toneme as Onset:


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
w 
1 we /kdr/ 0.18 | 0.20 | 0.19 206 | 294 | 250 | 277 | 364 | 321 HL 
2 ~WA /kus/ 0.38 | 0.31 | 0.35 233 | 320 | 277 | 327 | 416 | 372 | HL (50%), HLH (50%)
8 
1 Ba /tfag/ 0.13 | 0.17 | 0.15 234 | 298 | 266 | 280 | 459 | 370 HL 
2 ae /tfao/ 0.50 | 0.48 | 0.49 212 | 304 | 258 | 184 | 400 | 292 HLH 
3 Be /tfuth/ 0.29 | 0.29 | 0.29 232 | 314 | 273 | 314 | 393 | 354 | HL (40%), HLH (60%)
g 
1 fez /tid]/ 0.12 | 0.14 | 0.13 233 | 316 | 275 | 608 | 633 | 621 HL 
2 ea /tér/ 0.29 | 0.29 | 0.29 198 | 297 | 248 | 288 | 419 | 354 HL 
3 2S /tol/ 0.32 | 0.36 | 0.34 223 | 303 | 263 | 226 | 272 | 249 | HL (50%), HLH (50%)
4 eel /tdg/ 0.23 | 0.27 | 0.25 219 | 300 | 260 | 288 | 279 | 284 HL 
5 wet /tai/ 0.45 | 0.46 | 0.46 205 | 301 | 253 | 190 | 321 | 256 HLH 
6 wet /thi/ 0.44 | 0.41 | 0.43 221 | 328 | 275 | 249 | 293 | 271 HLH 
0 
1 de /tdn/ 0.15 | 0.16 | 0.16 | 231 | 308 | 270 | 273 | 427 | 350 HL 
2 Ua /tdy/ 0.19 | 0.23 | 0.21 | 228 | 299 | 264 | 282 | 371 | 327 HL 
3 fimre /tian/ 0.21 | 0.12 | 0.17 | 222 | 300 | 261 | 173 | 297 | 235 | HL (50%), HLH (50%)
4 Unit /tua/ 0.51 | 0.52 | 0.52 | 217 | 315 | 266 | 257 | 366 | 312 HLH 
g 
1 34y /pokk®/ | 0.14 | 0.12 | 0.13 | 230 | 322 | 276 | 653 | 891 | 772 HL 
2 au /pa d/ 0.30 | 0.29 | 0.30 | 230 | 319 | 275 | 244 | 352 | 298 HL 


Table 3/11: Contour of Tone in Monosyllabic Words with Toneme as Onset 


Falling tone is observed in the majority of the words. Example word 8dI /tfag/ Foam:


Fig 3/6: Male Sample - HL    Fig 3/7: Female Sample - HL


3.2.2 Di/ Tri/ Poly-syllabic Words with Toneme as Onset in Initial Syllable 


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
wy 

1 WT /koda/ | 0.20 | 0.21 | 0.21 | 210 | 284 | 247 | 298 | 294 | 296 HL

2 watt /kari/ 0.13 | 0.13 | 0.13 | 220 | 300 | 260 | 415 | 400 | 408 HL 

3 UlAT /k3ssa/ | 0.08 | 0.07 | 0.08 | 213 | 307 | 260 | 655 | 479 | 567 HL 

4 wrt /kahi/ | 0.17 | 0.20 | 0.19 | 211 | 285 | 248 | 349 | 330 | 340 HL 

5 wat /kidi/ | 0.20 | 0.23 | 0.22 | 221 | 306 | 264 | 492 | 475 | 484 HL

6 ff ait /kiggi/ | 0.09 | 0.09 | 0.09 | 226 | 312 | 269 | 782 | 670 | 726 HL

7 Wet /kéna/ | 0.09 | 0.07 | 0.08 | 243 | 324 | 284 | 670 | 441 | 556 HL 

8 uw /kéra/ 0.21 | 0.22 | 0.22 | 218 | 293 | 256 | 281 | 297 | 289 HL 

9 UST /kALi/ 0.18 | 0.19 | 0.19 | 233 | 296 | 265 | 317 | 335 | 326 HL 

10 WeT /kota/ 0.13 | 0.14 | 0.14 | 226 | 299 | 263 | 566 | 486 | 526 HL 

11 war 0.19 | 0.18 | 0.19 | 235 | 315 | 275 | 359 | 375 | 367 HL 
/kurona/ 

¥ 

1 BS /tfada/ | 0.18 | 0.21 | 0.20 | 221 | 290 | 256 | 336 | 403 | 370 HL

2 Bz /tfapu/ | 0.22 | 0.25 | 0.24 | 221 | 290 | 256 | 251 | 348 | 300 HL

3 fssaq 0.11 | 0.14 | 0.13 | 240 | 320 | 280 | 323 | 448 | 386 HL 
/tfitak/ 

4 sind 0.22 | 0.26 | 0.24 | 213 | 290 | 252 | 357 | 433 | 395 HL 
/tfadzor/ 

5 Sat /tfoli/ 0.18 | 0.23 | 0.21 | 232 | 301 | 267 | 319 | 388 | 354 HL 

6 gor 0.16 | 0.18 | 0.17 | 234 | 300 | 267 | 522 | 549 | 536 HL 
/tfutha/ 

7 vust 0.19 | 0.20 | 0.20 | 227 | 304 | 266 | 463 | 464 | 464 HL 
/tfspori/ 

8 vast 0.11 | 0.13 | 0.12 | 229 | 305 | 267 | 487 | 665 | 576 HL 
/tfagora/ 

9 Baer 0.10 | 0.09 | 0.10 | 240 | 318 | 279 | 937 | 833 | 885 HL 
/tfokona/ 

id 
1 UAE /tokkon/ | 0.07 | 0.07 | 0.07 | 228 | 313 | 271 | 835 | 576 | 706 HL
2 ear /tilla/ 0.09 | 0.08 | 0.08 | 235 | 328 | 282 | 503 | 552 | 528 HL 
3 we /taba/ 0.13 | 0.15 | 0.14 | 215 | 299 | 257 | 426 | 502 | 464 HL 
4 wat /tadi/ 0.14 | 0.17 | 0.16 | 223 | 304 | 264 | 420 | 463 | 442 HL 
5 feeo 0.13 | 0.15 | 0.14 | 211 | 311 | 261 | 477 | 495 | 486 HL 
/tidora/ 
6 fesqer 0.07 | 0.08 | 0.08 | 240 | 335 | 288 | 517 | 566 | 542 HL 
/tilkova/ 
7 ufger 0.09 | 0.12 | 0.11 | 225 | 310 | 268 | 537 | 352 | 445 HL 
/{ahena/ 
8 gue" 0.17 | 0.20 | 0.19 | 231 | 304 | 268 | 476 | 488 | 482 HL 
/tudy na/ 
7 
1 OT /tobi/ 0.14 | 0.16 | 0.15 | 227 | 319 | 273 | 454 | 463 | 459 HL 
2 OOH /tanvf/ 0.06 | 0.08 | 0.07 | 224 | 320 | 272 | 386 | 506 | 446 HL
3 Tet /t5da/ 0.20 | 0.24 | 0.22 | 221 | 298 | 260 | 312 | 446 | 379 HL 
4 Gare/tanad/ | 0.05 | 0.04 | 0.05 | 232 | 322 | 277 | 449 | 678 | 564 HL 
5 UH /tdram/ | 0.11 | 0.14 | 0.13 | 235 | 308 | 272 | 348 | 424 | 386 HL 
6 oat /téni/ 0.11 | 0.08 | 0.10 | 242 | 339 | 291 | 458 | 529 | 494 HL 
7 eer 0.15 | 0.16 | 0.16 | 228 | 312 | 270 | 507 | 600 | 554 HL 
todola/ 
g 
1 sat /pagi/ 0.17 | 0.22 | 0.20 | 206 | 320 | 263 | 309 | 429 | 369 HL 
2 3e /pddu/ 0.23 | 0.25 | 0.24 | 226 | 309 | 268 | 332 | 454 | 393 HL 
4 Syst 0.09 | 0.08 | 0.09 | 263 | 309 | 286 | 870 | 883 | 877 HL 
/pidzdzna/ 
5 sHat /pasuri/ | 0.07 | 0.06 | 0.06 | 237 | 320 | 279 | 585 | 953 | 769 HL 
6 fyAcud 0.08 | 0.08 | 0.08 | 224 | 315 | 270 | 583 | 764 | 674 HL 
/priftatfar/ 


Table 3/12: Contour of Tone in Di/ Tri/ Poly-syllabic Words with Toneme as Onset
in Initial Syllable
The toneme as onset in the initial syllable of the word always bears a falling tone,
observed on the nucleus of the syllable, e.g. AE /tdkkon/ Cover


Fig 3/8: Male Sample - HL    Fig 3/9: Female Sample - HL


Discussion 


It has been discussed in the literature survey that a toneme as onset in the initial
syllable leads to falling tone, which is corroborated for mono/di/tri/poly-syllabic
words, as is evident from Tables 3/11 & 3/12. There is no reference in the literature
to a toneme as coda in the initial syllable, but experimentally rising tone has been
observed in the case of monosyllabic words (refer Table 3/10). Falling-rising tone has
been observed in words having a diphthong in an open syllable; in closed syllables,
however, it is observed in only 50% of the speakers, as can be seen in far /tian/.

In a monosyllabic word with a toneme as both onset and coda, the toneme in coda is
substituted by the corresponding voiced unaspirated consonant due to articulatory
constraints, e.g. au /pi d/. Such words occur very infrequently.

The relevant Praat graphs are given in Appendix C.
The observations are summed up below:


S. No. | Acoustic environment | Tone observed | Allotones
1 | Monosyllabic with tonemes as coda | Rising tone | -
2 | Mono/ Di/ Tri/ Poly-syllabic with tonemes as onset | Falling tone | Falling-rising
  | • Diphthong in open syllable | Falling-rising |


Table 3/13: Tone Rules (refer Data Tables: 3/10, 3/11 & 3/12)


3.2.3 Tri/ Poly-syllabic Words with Toneme in Medial Syllable


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
wy 

1 shyer 0.08 0.08 | 0.08 | 225 | 319 | 272 | 264 | 467 | 366 LH 
/tagara / 

2 fewgar 0.10 0.10 | 0.10 | 223 | 313 | 268 | 290 | 327 | 309 LH 
/nigorna/ 

3 Gsuyer 0.08 0.08 | 0.08 | 217 | 310 | 264 | 241 ) 333 | 287 LH 
/olagara / 

4 Quer /agdna/ | 0.09 | 0.25 | 0.17 | 232 | 317 | 275 | 291 | 414 | 353 LH

5 Upygar 0.09 0.11 | 0.10 | 229 | 304 | 267 | 228 | 339 | 284 LH 
/pagerna/ 

6 fewrgar 0.24 0.23 | 0.24 | 206 | 296 | 251 | 212 | 300 | 356 HL 
/nigarna/ 

7 Uy 0.19 0.20 | 0.20 | 216 | 306 | 261 |) 220 | 427 | 324 HL 
/pagura/ 

8 user 0.17 0.17 | 0.17 | 218 | 278 | 248 ) 221 | 267 | 244 HL 
/k*3galna/ 

¥ 
1 ve" 0.09 | 0.08 | 0.09 | 227 | 315 | 271 | 262 | 492 | 377 LH 
/ri& onja/ 
2 Hite 0.10 | 0.10 | 0.10 | 212 | 301 | 257 | 335 | 482 | 409 LH 
/sadzidar/ 
3 ayer 0.08 | 0.09 | 0.09 | 228 | 309 | 269 | 198 | 406 | 302 LH 
/bodgana/ 
4 AHSeTt 0.06 | 0.09 | 0.08 | 210 | 307 | 259 | 269 | 382 | 326 LH 
/somdzada 
ri/ 
5 fars"eet 0.23 | 0.27 | 0.25 | 216 | 302 | 259 | 161 | 258 | 210 HL 
/gidzaura/ 
6 @sTgar 0.21 | 0.22 | 0.22 | 218 | 288 | 253 | 237 | 323 | 280 HL 
/odgarna/ 
cv} 
1 ger 0.08 | 0.08 | 0.08 | 210 | 286 | 248 | 192 | 238 | 215 LH 
/tuddra / 
2 Heer 0.07 | 0.07 | 0.07 | 228 | 328 | 278 | 238 | 300 | 269 LH 
/sddare / 
3 JeEny 0.07 | 0.08 | 0.08 | 225 | 312 | 269 | 279 | 484 | 382 LH 
/hddonsar/ 
4 Hest 0.17 | 0.18 | 0.18 | 226 | 306 | 266 | 275 | 349 | 312 HL 
/sddéla/ 
5 feeo 0.20 | 0.24 | 0.22 | 214 | 297 | 256 | 200 | 371 | 286 HL 
/tidora/ 
6 ger 0.15 | 0.16 | 0.16 ) 214 | 285 | 250 | 519 | 592 | 556 HL 
/budapa/ 
q 
1 feud 0.05 0.06 0.06 | 244 | 320 | 282 | 289 | 526 | 408 LH 
/ki@ 16/ 
2 wae 0.06 0.05 0.06 | 228 | 319 | 274 | 249 | 354 | 302 LH 
/g3@ la/ 
3 vugyEt 0.06 0.05 0.06 | 237 | 318 | 278 | 291 | 589 | 440 LH 
/tfoddrpon, 
a/ 
4 wiatgr 0.19 0.20 0.20 | 222 | 304 | 263 | 204 | 336 | 270 HL 
/3c ra/ 
5 ACTS 0.21 0.23 0.22 | 221 | 296 | 259 | 161 | 274 | 218 HL 
/sodaron/ 
3 
1 gget 0.08 0.06 0.07 | 245 | 321 | 283 | 162 | 377 | 270 LH 
/rdbare / 
2 Bg 0.07 0.07 0.07 | 234 | 317 | 276 | 253 | 430 | 342 LH 
/labbana/ 
3 fesee 0.22 0.22 0.22 | 227 | 311 | 269 | 201 | 286 | 244 HL 
/ntbauna/ 


Table 3/14: Contour of Tone in Tri/ Poly-syllabic Words with Toneme in Medial Syllable 


It is observed from the above table:


• The medial syllable containing a short vowel and toneme results in rising tone, e.g.
  TUS /gdddla/ Muddy


Fig 3/10: Male Sample - LH    Fig 3/11: Female Sample - LH


• The medial syllable containing a long vowel and toneme results in falling tone, e.g.
  AMS /sadaran/ Simple


Fig 3/12: Male Sample - HL    Fig 3/13: Female Sample - HL


Discussion 


Rising or falling tone is observed for tonemes in the medial syllable, depending
on whether the TBU is a short or a long vowel respectively.


The observations are summed up below:


S. No. | Acoustic environment | Tone observed | Allotones
1 | Tri / Poly-syllabic with tonemes in medial syllable and short vowel as TBU | Rising | -
2 | Tri / Poly-syllabic with tonemes in medial syllable and long vowel as TBU or diphthong (long + short) | Falling | -


Table 3/15: Tone Rules (refer Data Table: 3/14)
3.2.4 Di-syllabic Words with Toneme in Final Syllable


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
wy 
1 feu /nigga/ | 0.07 | 0.06 | 0.07 | 190 | 278 | 234 | 401 | 417 | 409 | LH
2 ault /kSgi/ 0.06 0.22 | 0.19 | 209 | 300 | 255 | 283 | 233 | 258 | LH 
3 Suft /bdggi/ 0.07 0.07 | 0.07 | 193 | 263 | 228 | 287 | 293 | 290 | LH 
4 BY /logu/ 0.08 0.09 | 0.09 | 195 | 273 | 234 | 193 | 363 | 278 | LH 
5 nowy 0.13 0.16 | 0.15 | 219 | 306 | 263 | 404 | 343 | 374 | HL 
/ork 3y/ 
6 fS4A /nigas/ | 0.39 | 0.39 | 0.39 | 216 | 299 | 258 | 186 | 386 | 286 | HLH
7 ws /tfigat/ | 0.39 | 0.34 | 0.37 | 210 | 280 | 245 | 144 | 241 | 193 | HLH
¥ 
1 JZ /hddzu/ 0.16 0.19 ) 0.18 | 193 | 278 | 236 | 349 | 243 | 296 | LH 
2 BE /bé dzo/ | 0.08 0.08 | 0.08 | 194 | 280 | 232 | 358 | 639 | 499 | LH 
3 He /dzadzu/ | 0.17 | 0.21 | 0.19 | 191 | 270 | 233 | 237 | 300 | 269 | LH
4 we /atfakk/ | 0.12 0.13 | 0.13 | 229 | 313 | 271 | 511 | 488 | 500 | HL 
5 ASI /sédza/ | 0.45 0.36 | 0.41 | 203 | 301 | 252 | 180 | 287 | 234 | HLH 
6 ASS 0.47 0.42 | 0.45 | 206 | 299 | 253 | 174 | 378 | 276 | HLH 
/sodzai/ 
3 
1 Hes 0.10 0.13 | 0.12 | 207 | 247 | 277 | 146 | 254 | 200 | LH 
/sid 1/ 
2 wes 0.15 0.16 | 0.16 | 235 | 331 | 283 | 146 | 228 | 187 | LH 
/bid d1/ 
3 Ae 0.18 0.21 | 0.20 | 205 | 318 | 262 |) 227 | 341 | 284 | LH 
/s3d_ al 
4 Mize 0.08 0.13 | 0.11 | 226 | 336 | 281 | 190 | 268 | 229 | LH 
/goad dn/ 
5 ae 0.16 0.21 / 0.19 | 199 | 313 | 256 | 184 | 410 | 297 | LH 
/kSch / 
6 Det /tfiidi/ 0.21 0.24 | 0.23 | 231 | 324 | 278 | 269 | 338 | 304 | LH 
7 ae" /bidda/ | 0.08 0.07 | 0.08 | 189 | 278 | 234 | 484 | 619 | 552 | LH 
8 amet 0.28 0.32 | 0.30 | 217 | 285 | 251 | 195 | 268 | 232 | LH 
/guadi/ 
9 Ward /tanad/ | 0.33 | 0.36 | 0.35 | 221 | 307 | 264 | 170 | 284 | 227 | HL (60%), HLH (40%)
10 Jee /ts dau/ | 0.44 0.43 | 0.44 | 219 | 285 | 252 | 165 | 318 | 242 | HLH 
11 sere /kodai/ | 0.48 0.45 | 0.47 | 223 | 300 | 262 | 162 | 348 | 255 | HLH 
"q 
1 f&ag /iddér/ | 0.10 0.15 | 0.13 | 236 | 331 | 284 | 241 | 225 | 233 | LH 
2 HUd /mod r/ | 0.17 0.17 | 0.17 | 195 | 334 | 265 | 219 | 304 | 262 | LH 
3 eum 0.38 0.36 | 0.37 | 229 | 318 | 274 | 207 | 291 | 249 | LHL 
/doudia/ 
4 WITT /Sdl a/ 0.30 0.23 | 0.27 | 224 | 268 | 246 | 238 | 547 | 393 | LH 
5 Aah /sddi/ 0.17 0.21 | 0.19 | 205 | 304 | 255 | 228 | 312 | 217 | LH 
6 UO /khgda/ (| 0.11 0.12 | 0.12 | 169 | 273 | 221 | 351 | 344 | 348 | LH 
7 Tt /gdc / | 0.09 | 0.10 | 0.10 | 185 | 279 | 232 | 264 | 355 | 310 | LH
8 GO /gidda/ | 0.06 | 0.06 | 0.06 | 184 | 279 | 232 | 457 | 571 | 514 | LH
9 TT /gidda/ | 0.06 | 0.07 | 0.07 | 196 | 287 | 242 | 393 | 573 | 483 | LH
10 yore 0.35 0.38 | 0.37 | 191 | 295 | 243 | 149 | 239 | 194 | HLH 
/prodan/ 
11 Aue /kSdii/ | 0.41 0.39 | 0.40 | 201 | 323 | 262 | 165 | 261 | 213 | HLH 
3 
1 usd 0.11 0.17 | 0.14 | 222 | 335 | 279 | 165 | 313 | 339 | LH 
/dubbst/ 
2 TS /gordb/ | 0.11 0.15 | 0.13 | 187 | 292 | 240 | 219 | 413 ) 316 | LH 
3 Wes 0.10 0.10 | 0.10 | 191 | 283 | 237 | 232 | 334 | 283 | LH 
/dorlabb/ 

4 Sat /nAbi/ 0.15 0.15 | 0.15 | 192 | 273 | 233 | 258 | 214 | 236 | LH 

5 eS" /t6ba/ 0.15 0.13 | 0.14 | 209 | 292 | 251 | 217 | 349 | 283 | LH 

6 Tas /darba/ | 0.09 | 0.11 | 0.10 | 208 | 264 | 236 | 232 | 275 | 254 | LH

7 fSd3 /nirbe/ | 0.11 | 0.10 | 0.11 | 197 | 284 | 243 | 235 | 367 | 301 | LH

8 wisn 0.40 0.41 | 0.41 | 189 | 302 | 246 | 184 | 369 | 277 | HLH 

/abias/ 
9 TS /gdtir/ | 0.27 0.30 | 0.29 | 222 | 295 | 259 | 208 | 420 | 314 | HL 


Table 3/16: Contour of Tones in Di-syllabic Words with Toneme in Final Syllable 


The majority of the words show rising or falling tone depending on the context, as
detailed below:


• Rising tone is observed in words having an open final syllable. It is also observed
  that the tone is not realized on the open vowel at the end of the word and shifts to
  the prior vowel, e.g. SST /nabi/ Navel


Fig 3/14: Male Sample - LH    Fig 3/15: Female Sample - LH


• Falling tone is observed in words having a closed final syllable, e.g.
  JIS /gadbir/ Serious


Fig 3/16: Male Sample - HL    Fig 3/17: Female Sample - HL


Discussion 


Rising tone has been observed in words having an open final syllable. In addition, the
tone is shifted to the prior vowel, as investigated. Falling tone has been observed in
words having a closed final syllable. Falling-rising tone has been observed in words
having a diphthong (short + long) and (long + long), with an exception in GUMT /dudia/
where rising-falling tone is observed. Falling-rising tone is also observed when a long
vowel is the TBU, owing to a fricative, flap or nasal coda.
The observations are summed up below:


S. No. | Acoustic environment | Tone observed: Tone contour | Tone observed: Shifting of tone on prior vowel | Allotones
1 | Di-syllabic with toneme in final open syllable | Rising tone | Yes | -
  | • Diphthong (long + long) | Rising-falling | - |
2 | Di-syllabic with toneme as onset in final closed syllable | Falling tone | Not applicable | HLH
  | • Diphthong (short + long) and (long + long) and flap / fricative / nasal coda | Falling-rising | - |


Table 3/17: Tone Rules (refer Data Table: 3/16)
3.2.5 Tone Patterns in Composite Words


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
! wy - WEY /kdllukara/ 
Ws 0.07 | 0.07 | 0.07 | 232 | 302 | 267 | 608 | 500 | 554 HL 
/kallu/ + 
WF kara 0.15 | 0.17 | 0.16 | 223 | 301 | 262 | 216] 339 | 278 HL 
/ 
BS -Zosa' /tfontfana/ 
1 BS /tfon/ | 0.11 | 0.10 | 0.11 | 237 | 329 | 283 | 452 | 506 | 479 HL
+ 
salit/ana 0.07 | 0.07 | 0.07 | 225 | 288 | 257 | 233 | 332 | 283 HL 
/ 
3 -foHfSH /rimtfim/ 
2 | fH /rm/ 
+ 
fay 0.15 | 0.13 | 0.14 | 231 | 329 | 280 | 267 | 317 | 292 LH 
/tfim/ 
q - SHOT /namtari/ 
1 lon 
/nam/ + 
ort 0.15 | 0.16 | 0.16 | 226 | 317 | 272 | 254 | 338 | 296 HL 
/tari/ 
3 - 33515 /pepit/ 
1 3 /pa/ 0.15 | 0.12 | 0.14 | 223 | 305 | 264 | 248 | 428 | 338 HL 
+ 
HZ /pit/ 0.20 | 0.22 | 0.21 | 225 | 280 | 253 | 332 | 565 | 449 HL 
B&W - CsA /bdzbbg/ 
1 OF /sdz/ | 0.06 | 0.04 | 0.05 | 217 | 284 | 251 | 509 | 739 | 624 LH
+ 
sey _ 
a 0.09 | 0.10 | 0.09 | 226 | 321 | 286 | 185 | 564 | 338 
/bog/ 


Table 3/18: Contour of tones in Composite Words 
The tone rules as discussed in previous sections are fully applicable to the constituent
members of the composite words. An example word, SO8o" /tfontfana/ Sound making toy,
is shown below:


Fig 3/18: Male Sample - HL    Fig 3/19: Female Sample - HL


3.2.6 Research Findings on Tone arising from Supra-Laryngeal Consonants 


• As per the literature survey, a toneme in initial position leads to falling tone, which
  has been corroborated experimentally and holds good for mono/ di/ tri/ poly-syllabic
  words with toneme as onset in the initial syllable. However, a falling-rising
  allotone has been observed in 50% of cases. Falling-rising tone has been observed in
  all cases having a diphthong as coda.

• In addition, falling tone has also been observed in tri / poly-syllabic words with
  toneme in the medial syllable and a long vowel as TBU or a diphthong (long + short).
  Falling tone has been observed in di-syllabic words with toneme as onset in the final
  closed syllable. However, falling-rising tone has been observed with diphthongs
  (long + short) & (long + long) and with flap / fricative / nasal coda.

• Mono-syllabic words with toneme as coda in the initial syllable show rising tone, and
  rising tone has also been observed in tri / poly-syllabic words with toneme in the
  medial syllable and a short vowel as TBU.

• In addition, rising tone has also been observed in di-syllabic words with toneme in
  the final open syllable. Rising-falling tone has been observed with a diphthong
  (long + long vowel).
These findings are summed up in the table below:


Occurrence of Toneme in | Acoustic environment | Tonal Variations
Initial syllable | 1. Monosyllabic with toneme as coda | LH (100%)
                 | 2. Monosyllabic with toneme as onset | HL (50%), HLH (50%)
                 |    • Diphthong | HLH (100%)
                 | 3. Tri / Poly-syllabic with toneme as onset | HL (100%)
Medial syllable  | 1. Tri-syllabic with toneme and short vowel as TBU | LH (100%)
                 | 2. Tri-syllabic with toneme and long vowel as TBU or diphthong (long + short vowel) | HL (100%)
Final syllable   | 1. Di-syllabic with toneme in final open syllable | LH (100%) (tone shifts to prior vowel)
                 |    • With diphthong (long + long) | LHL (100%)
                 | 2. Di-syllabic with final closed syllable |
                 |    • Toneme as onset | HL (100%)
                 |    • Toneme as coda | HL (60%), HLH (40%)
                 |    • Diphthong (short + long) and (long + long) and flap / fricative / nasal coda | HLH (100%)


Table 3/19: Rules for Tone arising from Supra-Laryngeal Consonants 
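
The rules of Table 3/19 lend themselves to a simple lookup, sketched below in Python. This is only an illustration: the key names and the function `expected_contour` are hypothetical, only the most frequent contour of each environment is stored (the percentages and allotones are omitted), and deciding which environment a given word falls under is left to the caller.

```python
from typing import Dict, Tuple

# (syllable position of the toneme, acoustic environment) -> dominant contour.
# Key names are illustrative labels, not terminology from the thesis.
TONE_RULES: Dict[Tuple[str, str], str] = {
    ("initial", "mono_coda"): "LH",
    ("initial", "onset"): "HL",
    ("initial", "onset_diphthong"): "HLH",
    ("medial", "short_vowel"): "LH",
    ("medial", "long_vowel_or_diphthong"): "HL",
    ("final", "open"): "LH",  # tone shifts to the prior vowel
    ("final", "open_diphthong_long_long"): "LHL",
    ("final", "closed_onset"): "HL",
    ("final", "closed_coda"): "HL",
    ("final", "closed_diphthong_or_flap_fricative_nasal_coda"): "HLH",
}


def expected_contour(position: str, environment: str) -> str:
    """Return the dominant contour for an environment, or raise if none is tabulated."""
    try:
        return TONE_RULES[(position, environment)]
    except KeyError:
        raise ValueError(
            f"no rule tabulated for {position!r} / {environment!r}"
        ) from None
```

A lexicon-building tool could call `expected_contour("medial", "short_vowel")` to obtain the default contour for a word and then let an annotator override it where a speaker-dependent allotone is preferred.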
3.3 Experimental Analysis of Independent Tones


In this case, words containing the consonant ਹ /h/ in the initial/medial/final syllable
and conjuncts of /ɦ/ in the medial/final syllable only were gathered, as conjuncts of /ɦ/
do not occur orthographically in the initial syllable. Initial /h/ is extensively used in
orthography and, as per the literature survey, it is considered non-tonal, which will be
verified. The data sampling for the study of independent tones is as below:


Words consisting of | Mono-syllabic | Disyllabic        | Trisyllabic                | Polysyllabic     | Total
                    |               | Initial | Final   | Initial | Medial | Final   | Initial | Medial |
Consonant /h/       | 15            | 15      | 16      | 6       | 16     | 3       | -       | 3      | 74
Conjuncts of /ɦ/    | 1             | -       | 6       | -       | 8      | -       | -       | 2      | 17
Sub-Total           | 16            | 15      | 22      | 6       | 24     | 3       | -       | 5      |
Total               | 16            | 37                | 33                         | 5                | 91


Table 3/20: Size of Data Samples for study of Independent Tones 


The corresponding word lists are given in Appendix A. 


Data Collation and Presentation 


The spectrographic analysis of all the male & female samples was carried out using
PRAAT. The duration, fundamental frequency (F0) and quarter-wise slope of the
vowel associated with the tone (TBU) have been recorded. The observations on the
contour of the tone over the TBU have been tabulated. The data have been tabulated
for two categories of words (consonant /h/ & conjuncts of /ɦ/), capturing the
variety of acoustic environments as discussed in section 3.1.3, for studying the nature
of the associated tone across the male and female speakers.


These tones can be broadly divided into two categories:
3.3.1 Tone Variations Associated with Consonant /h/


Recording of Data Sheets


The phoneme-level annotated data of the above samples was used for recording various
acoustic parameters as discussed in section 3.1.3. Sample data sheets are given
below:


Sample Data Sheet 1 (consonant /h/): Jad /taba/ 


Male Speakers | F0 in (Hz/sec) | Slope in (Hz/sec) | Cross-Sectional slope of TBU (Hz/sec) | Contour of Tone | Duration of TBU
[data rows illegible]


Table 3/21: Data Sheet of Male Speakers


Female Speakers | F0 in (Hz/sec) | Slope in (Hz/sec) | Cross-Sectional slope of TBU (Hz/sec) | Contour of Tone | Duration of TBU
[individual speaker rows illegible]
Faverage | 298 | 306 | 266 | 279 | a0 | 03 | LH | 038 _|


Table 3/22: Data Sheet of Female Speakers
Sample Data Sheet 2 (conjuncts of /ɦ/): ABet /ktuldna/


Speakers | F0 in (Hz/sec) | Slope in (Hz/sec) | Cross-Sectional slope of TBU (Hz/sec) | Contour of Tone | Duration of TBU
[male and female data rows illegible]


Table 3/24: Data Sheet of Female Speakers
3.3.1.1 Monosyllabic Words


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
1 | OT /k*6/ | 0.43 | 0.35 | 0.39 | 212 | 319 | 266 | 229 | 348 | 289 | LH (60%), LHL (40%)
2 | Wd /ga/ | 0.51 | 0.31 | 0.41 | 192 | 284 | 238 | 191 | 361 | 276 | LH (60%), LHL (40%)
3 | WIT /4/ | 0.37 | 0.25 | 0.31 | 192 | 271 | 232 | 242 | 373 | 308 | LH (50%), LHL (50%)
4 | ord /tfa/ | 0.43 | 0.34 | 0.39 | 200 | 300 | 250 | 347 | 319 | 333 | LH (50%), LHL (50%)
5 | ed /tfr6/ | 0.42 | 0.32 | 0.37 | 199 | 299 | 249 | 220 | 598 | 409 | LH (50%), LHL (50%)
6 | eT /va/ | 0.31 | 0.29 | 0.30 | 192 | 276 | 234 | 216 | 502 | 359 | LH (50%), LHL (50%)
7 | Bd /16/ | 0.43 | 0.29 | 0.34 | 217 | 287 | 252 | 214 | 472 | 404 | LH (30%), LHL (70%)
8 | Sako | 0.48 | 0.29 | 0.39 | 199 | 311 | 255 | 188 | 434 | 311 | LH (20%), LHL (80%)
9 | Ud /khu/ | 0.43 | 0.33 | 0.38 | 219 | 310 | 265 | 231 | 541 | 386 | LH (20%), LHL (80%)
Concluding the nature of tone across above 9 words: LH (50%), LHL (50%)
10 | Sd /t6/ | 0.44 | 0.36 | 0.40 | 201 | 298 | 250 | 297 | 420 | 359 | HL (30%), HLH (70%)
11 | wd /ta/ | 0.36 | 0.36 | 0.36 | 199 | 283 | 241 | 250 | 380 | 315 | HL (20%), HLH (80%)
12 | wfr /se/ | 0.24 | 0.22 | 0.23 | 198 | 289 | 244 | 333 | 522 | 428 | LHL (30%), HLH (70%)

Table 3/25: Contour of Independent tone in Mono-syllabic Words 
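
The percentages in the "Contour of tone" column express how many speakers produced each contour for a word. A minimal Python sketch of that tally is given below; the function names and the ten speaker labels are hypothetical, chosen only to reproduce a 60% / 40% split, and do not represent the recorded data.

```python
from collections import Counter
from typing import Dict, Sequence


def contour_distribution(labels: Sequence[str]) -> Dict[str, int]:
    """Percentage of speakers (rounded to whole numbers) producing each contour."""
    if not labels:
        raise ValueError("at least one speaker label is required")
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total) for label, n in counts.items()}


def dominant_contour(labels: Sequence[str]) -> str:
    """Contour produced by the largest number of speakers."""
    if not labels:
        raise ValueError("at least one speaker label is required")
    return Counter(labels).most_common(1)[0][0]


# Hypothetical per-speaker labels for one word: six LH and four LHL.
speaker_labels = ["LH"] * 6 + ["LHL"] * 4
distribution = contour_distribution(speaker_labels)
tone = dominant_contour(speaker_labels)
```

In this sketch the dominant contour would be reported as the tone of the word and the remaining contours as its allotones.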
The tone variations may be seen from the following examples:


• Jd /tfa/ To wish: LH is observed in 50% of the speakers and the allotone LHL in the
  remaining 50% of the speakers


Fig 3/20: Male Sample - LHL    Fig 3/21: Female Sample - LH


• wd /ta/ Fall: the allotone HLH is observed in 80% of the speakers, the toneme being
  the onset of the monosyllabic word, whereas HL is seen in only 20% of the speakers


Fig 3/22: Male Sample - HLH    Fig 3/23: Female Sample - HLH
Discussion


The tones reported by linguists, as discussed in section 3, indicate high tone in OJ
/tfa/. The data above indicates that the consonant /h/, when orthographically attached
as a coda in monosyllabic words, results in the allotone LHL in 50% of the speakers. The
variation in tone pattern observed when the onset is a toneme reflects the presence of
falling tone coupled with the allotone HLH in 75% of cases. The fricative onset in
Afd /sé/ leads to two allotones, viz. LHL in 30% of the speakers and HLH in 70% of the
speakers.


The above observations are summed up below:


S. No. | Acoustic environment | Tone observed | Allotones
1 | Monosyllabic with consonant /h/ as coda | Rising tone | Rising-falling in 50% of the speakers
2 | Monosyllabic with consonant /h/ as coda and toneme as onset | Falling tone | Falling-rising in 75% of the speakers
3 | Monosyllabic with consonant /h/ as coda and fricative consonant as onset | - | Rising-falling in 30% and falling-rising in 70% of the speakers


Table 3/26: Tone Rules (refer Data Table: 3/25)
3.3.1.2 Di-syllabic Words (with consonant /h/ as coda in initial syllable)


S. No. | Word (Text & IPA) | Duration: M avg | F avg | Avg | F0: M avg | F avg | Avg | Slope: M avg | F avg | Avg | Contour of tone (M&F)
1 | ftgear /éna/ | 0.15 | 0.16 | 0.15 | 142 | 252 | 197 | 171 | 326 | 248 | LH (40%), NT (60%)
2 | nrger /ala/ | 0.18 | 0.13 | 0.15 | 141 | 230 | 185 | 155 | 149 | 152 | LH (50%), NT (50%)


Composite word: HfJHS" /sésuba/


1 | Afe /se/ + As" /suba/ | 0.20 | 0.18 | 0.19 | 182 | 264 | 223 | 414 | 441 | 427 | HLH (60%), NT (40%)


Table 3/27: Contour of Independent Tone in Di-syllabic Words
(with consonant /h/ as coda in initial syllable)


Discussion


The tone reported by linguists, taking fda" /éna/ as an example, as discussed in
section 3, indicates high tone, considering the orthographic consonant /h/ to be
in the medial syllable; however, as per the hypothesis of word categorization followed
in the present investigation, the consonant /h/ is orthographically onset in the initial
syllable of this disyllabic word. Rising tone is observed in 40% of the speakers. It was
observed while annotating the data that 60% of the speakers (50% male & 50%
female) have articulated the consonant /h/, which reveals the trend of loss of tone among
some speakers. Similarly, 50% of the speakers have recorded WIS"
/ahola/ as non-tonal, and rising tone is observed in the rest.
HAfd /sé/ is monosyllabic (the first part of the composite word AfJHS™ /sésuba/), which
was discussed in section 3.3.1.1. Accordingly, HLH tone has been observed in 60% of
the speakers, and the rest of the speakers have pronounced the consonant /h/.
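The percentages quoted in these discussions are shares of speakers per annotated contour. As a minimal sketch of that tally, assuming one contour label per speaker (the function name and the ten sample labels below are hypothetical and are not the recordings used in this study):

```python
from collections import Counter


def contour_distribution(labels):
    """Return {contour: share of speakers in whole percent}.

    `labels` holds one annotated contour per speaker, e.g. "LH", "HLH"
    or "NT" (non-tonal, i.e. the consonant /h/ was articulated).
    """
    if not labels:
        raise ValueError("at least one speaker label is required")
    counts = Counter(labels)
    total = len(labels)
    return {contour: round(100 * n / total) for contour, n in counts.items()}


# Hypothetical annotations for ten speakers (illustration only).
sample = ["HLH"] * 6 + ["NT"] * 4
print(contour_distribution(sample))
```

With these made-up labels the tally has the same shape as the HLH / NT split reported for the composite word above.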


The above observations are summed up below: 


S. No. | Acoustic environment | Tone observed | Allotones
1 | Di-syllabic with consonant /h/ as coda in initial syllable | Rising | -
  | Composite word with consonant /h/ as coda in initial syllable | Falling-rising | -


Table 3/28: Tone Rules (refer Data Table: 3/27) 


3.3.1.3 Di / Tri-syllabic Words (with consonant /h/ as coda in final syllable) 


S. No. | Word (text & IPA) | Duration M avg | Duration F avg | Duration Avg | F0 M avg | F0 F avg | F0 Avg | Slope M avg | Slope F avg | Slope Avg | Contour of tone (M&F)
1 | sug [IPA illegible] | 0.39 | 0.34 | 0.40 | 200 | 293 | 264 | 222 | 306 | 217 | LH (60%), LHL (40%)
2 | sag [IPA illegible] | 0.39 | 0.32 | 0.36 | 196 | 216 | 206 | 188 | 521 | 355 | LH (80%), NT (20%)
3 | eA [IPA illegible] | 0.40 | 0.29 | 0.35 | 209 | 244 | 227 | 212 | 485 | 349 | LH (80%), LHL (20%)
4 | fenrg [IPA illegible] | 0.43 | 0.35 | 0.39 | 194 | 293 | 244 | 233 | 329 | 281 | LH (80%), LHL (20%)
5 | feedg /vidord/ | 0.40 | 0.29 | 0.35 | 219 | 223 | 221 | 185 | 442 | 314 | LH (40%), LHL (40%), NT (20%)


Table 3/29: Contour of Independent Tone in Di/Tri-syllabic Words (with consonant /h/ as coda in final syllable)
In the trisyllabic word fEdT /vidord/ 'rebellion', the allotone LHL is observed in 40% of
the speakers, LH is seen in 40% of the speakers, and 20% of the speakers
have pronounced the consonant /h/.




Fig 3/24: Male Sample - LHL Fig 3/25: Female Sample - LH 


Discussion 
Rising tone was observed in 70% of the speakers and rising-falling in 20% speakers. 


A trend of loss of these tones has been observed. 


S. No. | Acoustic environment | Tone observed | Allotones
1 | Di-syllabic / Tri-syllabic with consonant /h/ as coda in final syllable | Rising | Rising-falling

Table 3/30: Tone Rules (refer Data Table: 3/29) 
3.3.1.4 Independent Tone Rules Associated with Consonant /h/


Based on the above discussions, the tone rules are summed up below: 


S. No. | Independent tone | Corroboration of tone | Acoustic environment | Observations on tone variations / allotone
1 | Consonant /h/ | LH | Monosyllabic with consonant /h/ as coda | LHL (50%)
  |  |  | • Fricative onset | HLH (70%), LHL (30%)
  |  |  | • Composite word with consonant /h/ in initial syllable | HLH (60%), NT (40%)
  |  |  | Di-syllabic with consonant /h/ as coda in initial syllable | -
  |  |  | Di / Tri-syllabic with consonant /h/ as coda in final syllable | LHL (20%)
2 |  | NA | Monosyllabic with consonant /h/ as coda and toneme as onset | HLH (75%), HL (25%)


Table 3/31: Independent Tone Rules associated with Consonant /h/ and Allotone Variations


3.3.2 Tone Variations Associated with Conjuncts of /ɦ/
3.3.2.1 Monosyllabic Words 


S. No. | Word (text & IPA) | Duration M avg | Duration F avg | Duration Avg | F0 M avg | F0 F avg | F0 Avg | Slope M avg | Slope F avg | Slope Avg | Contour of tone (M&F)
1 | Is /g6l/ | 0.29 | 0.31 | 0.30 | 207 | 288 | 248 | 241 | 210 | 226 | LH (100%)


Table 3/32: Contour of Tone in Mono-syllabic Words 


Discussion 
Rising tone is observed. This observation is based on a single word, as such words are
infrequently used in the language.


3.3.2.2 Tri / Poly-syllabic Words with Conjunct Containing /h/ in Medial Syllable


S. No. | Word (text & IPA) | Duration M avg | Duration F avg | Duration Avg | F0 M avg | F0 F avg | F0 Avg | Slope M avg | Slope F avg | Slope Avg | Contour of tone (M&F)
1 | “yet /k»amani/ | 0.07 | 0.06 | 0.07 | 238 | 300 | 269 | 284 | 489 | 387 | LH (100%)
2 | uger /k»arava/ | 0.06 | 0.06 | 0.06 | 184 | 286 | 235 | 326 | 270 | 298 | LH (100%)
3 | YaEt /k®oldna/ | 0.08 | 0.08 | 0.08 | 236 | 312 | 274 | 168 | 306 | 237 | LH (100%)
4 | deat /k»ollava/ | 0.05 | 0.06 | 0.06 | 224 | 315 | 270 | 286 | 326 | 324 | LH (100%)
5 | fies /sindna/ | 0.08 | 0.08 | 0.08 | 230 | 311 | 271 | 201 | 257 | 229 | LH (50%), LHL (50%)
6 | wast /gatakona/ | 0.02 | 0.02 | 0.02 | 205 | 298 | 252 | 836 | 733 | 785 | LH (30%), LHL (70%)
7 | noe /solaba/ | 0.13 | 0.13 | 0.13 | 219 | 295 | 257 | 333 | 417 | 375 | HL (100%)
8 | BYH3z /tomatot/ | 0.13 | 0.12 | 0.13 | 224 | 291 | 258 | 318 | 364 | 341 | HL (100%)
9 | uggst /potaona/ | 0.23 | 0.23 | 0.23 | 247 | 294 | 271 | 195 | 263 | 229 | HL (100%)
10 | YUBrgIar /kbglarna/ | 0.19 | 0.20 | 0.20 | 194 | 264 | 229 | 123 | 215 | 169 | HL (100%)


Table 3/33: Contour of tone associated with Conjunct of /ɦ/ in Medial Syllable in Tri / Poly-syllabic Words
It is observed from the above table:

• The medial syllable containing a short vowel and a conjunct of /ɦ/ results in rising
tone, e.g. uyet /k»omdni/ 'multicoloured yarn'.




Fig 3/26: Male Sample - LH Fig 3/27: Female Sample - LH 


• The medial syllable containing a long vowel and a conjunct of /ɦ/ results in falling
tone, e.g. H&S /solaba/ 'seepage'.


Fig 3/28: Male Sample - HL Fig 3/29: Female Sample - HL 


Discussion 


Rising or falling tone is observed in the case of a conjunct of /ɦ/ in the medial syllable,
depending on whether the TBU is a short or a long vowel, as in the case of tonemes in the
same acoustic environment discussed in section 3.1.3. The rising-falling allotone is
observed when a short vowel is the TBU and the onset is a flap or a nasal.
The observations are summed up below:

S. No. | Acoustic environment | Tone observed | Allotones
1 | Tri / Poly-syllabic with conjunct of /ɦ/ in medial syllable and short vowel as TBU | Rising | Rising-falling
2 | Tri / Poly-syllabic with conjunct of /ɦ/ in medial syllable and long vowel as TBU or diphthong (long + short) | Falling | -


Table 3/34: Tone Rules (refer Data Table: 3/33) 


3.3.2.3 Di-syllabic Words Containing Conjunct of /ɦ/ in Final Syllable


S. No. | Word (text & IPA) | Duration M avg | Duration F avg | Duration Avg | F0 M avg | F0 F avg | F0 Avg | Slope M avg | Slope F avg | Slope Avg | Contour of tone (M&F)
1 | wes /galart/ | 0.10 | 0.13 | 0.12 | 233 | 320 | 277 | 169 | 325 | 247 | LH (60%), LHL (40%)
2 | we /gomot/ | 0.11 | 0.13 | 0.12 | 235 | 324 | 280 | 208 | 264 | 236 | LH (70%), LHL (30%)
3 | fABE /d3rldn/ | 0.11 | 0.12 | 0.12 | 236 | 315 | 276 | 161 | 210 | 186 | LH (50%), LHL (50%)
4 | om /thola/ | 0.22 | 0.20 | 0.21 | 206 | 322 | 264 | 123 | 182 | 153 | LH (50%), LHL (50%)
5 | ufgor /patia/ | 0.38 | 0.32 | 0.35 | 224 | 308 | 266 | 181 | 286 | 234 | LH (50%), LHL (50%)
6 | uae /parai/ | 0.47 | 0.41 | 0.44 | 198 | 295 | 247 | 158 | 299 | 229 | HLH (100%)


Table 3/35: Contour of Tone in Di-syllabic Words containing Conjunct of /ɦ/ in Final Syllable

• LH is observed in 50% of the speakers in words having the conjunct of /ɦ/ in the final
syllable, and LHL is observed in the remaining 50%, e.g. WSF /galst/ 'squirrel'.




Fig 3/30: Male Sample - LHL Fig 3/31: Female Sample - LH 


Discussion 


Rising tone is observed in 50% of cases and the allotone rising-falling in the remaining 50%. In the case
of a diphthong (long vowel + long vowel), falling-rising tone is observed.


The observations are summed up below: 


S. No. | Acoustic environment | Tone observed | Allotones
1 | Di-syllabic with conjunct containing /ɦ/ in final syllable or diphthong (short + long vowel) | Rising | Rising-falling
2 | Di-syllabic with conjunct containing /ɦ/ in final syllable followed by diphthong (long + long vowel) | Falling-rising | -


Table 3/36: Tone Rules (refer Data Table: 3/35) 
3.3.2.4 Independent Tone Rules Associated with Conjunct of /ɦ/


Based on the above discussions, the tone rules are summed up below: 


S. No. | Independent tone | Corroboration of tone | Acoustic environment | Observations on tone variations / allotone
1 | Conjunct of /ɦ/ | LH | Monosyllabic with conjuncts of /ɦ/ in final position | -
  |  |  | Di-syllabic with conjuncts of /ɦ/ in final syllable or diphthong (short + long) | LHL (50%)
  |  |  | • Open syllable with diphthong (long + long) | HLH (100%)
  |  |  | Tri / Poly-syllabic with conjuncts of /ɦ/ in medial syllable containing short vowel | -
  |  |  | • Flap / nasal onset | LHL (60%)
  |  | HL | Tri / Poly-syllabic with conjuncts of /ɦ/ in medial syllable containing long vowel or diphthong (long + short vowel) | -

Table 3/37: Tone Rules associated with Conjunct of /ɦ/ and Allotone Variations
3.3.3 Non-Tonal (NT) Occurrences of Consonant /h/

As discussed in section 3, /h/ in initial position is non-tonal according to Sangha (2014). The
following data was analysed to verify it.


S. Word IPA Meaning S. Word IPA Meaning 
No. transcription No transcription 
1 wear /hoka/ Sigh 1 JA /hasob/ law 
2 Jr /hogar/ Excreta of houseflies | 2 JdH /hazom/ Digested
3 fore /hisab/ calculation | 3 foarfest | /hikarti/ Apologal 
4 foHTgSs /himat/1/ Himachal 4 Jéd /honot/ Art, skill, 
Pradesh a 
5 Jeo /holara/ Swing 5 JatH /hakom/ Ruler 
6 Pair= rh /hadzor/ Present 6 omy /hia/ courage 
7 ated /hitor/ Heater 7 Jaet /hukena/ To raise 
8 Ie /hura/ buffet 8 SAMS /hesiat/ Status 
9 Jeo /hevan/ uncivilized | 9 der /hotfra/ Mean 
person 
10 dd /hor/ More, else | 10 | GH@nte | /hoslamdd/ Patience 
11 da /hod3/ Water tank | 11 | neag /ahdkar/ Pride 
12 Wot /ghar/ Food, diet | 12 | nfina /ghisok/ Peaceful 
13 wifgarene | /ohisavad/ Doctrine 13 | feafsarg | /tehar/ Poster 
14 fefsgA (| /itehas/ History 14 | Hfas /sahit/ Literature 
15 Atos /saheb/ Master 15 | Afsge /sohird/ kind, 
gentle 
16 AIS" /suhela/ Soothing 16 | Hoes /fahadot/ Martyrdom
17 Hote /fahid/ Martyr 17 | wigg /ohoar/ Diseases 
18 whe /ehad/ Resolve 18 | wfaeerH | /chodnama/ Treaty 
19 wWoId /ahor/ Impulse 19 | nrger /ala/ Superior 
20 fegei /éna/ These 20 | feHfsae | /imtehan / Test 
21 AfsaSs™ /sesuba/ Naturally 21 | WOR /sahos/ Courage 
22 figsHe | /sehotomdd/ Healthy 22 | faggot /sehora/ Honour 
23 Addn /sohad3/ Grace 23, | AeEt /sohona/ Good 
looking 
24 Aa /sohaga/ Borax 24 | Afsgss /fehotut/ Mulberry
25 Hfog /fehor/ City, town | 25 | ATQet /fahodi/ Testimony

26 Hdd /fohor/ Husband 26 | Hofer /sohatk/ Assistant 

27 Hofesr | /soharta/ support 27 | Haat /sohai/ Who 
provides 
help 

28 Hogar /soharna/ Bear 28 | Hor /sohara/ Support 

29 Hfor /sehad3/ Easy 

Table 3/38: Words with Non-Tonal Consonant /h/ 
Discussion 


The consonant /h/ is non-tonal when it occurs as the onset of its syllable in mono- / di- / tri- /
poly-syllabic words. Some sample graphs are given below:


Monosyllabic dd /hor/ More 




Fig 3/32: Male Sample - NT Fig 3/33: Female Sample - NT 
Disyllabic Heute /fahid/ Martyr


Fig 3/34: Male Sample - NT Fig 3/35: Female Sample - NT 


Trisyllabic fEHfSTS /mtehan/ Examination 




Fig 3/36: Male Sample - NT Fig 3/37: Female Sample - NT 
Polysyllabic »faATE"E /ahisavad/ Doctrine


Fig 3/38: Male Sample - NT Fig 3/39: Female Sample - NT 


3.3.4 Research Findings on Independent Tones 


The rising (LH) and falling (HL) tones discussed in the literature survey (refer section
3, Independent Tone) have been corroborated by and large, except in some specific
acoustic environments as discussed below:

• The allotone rising-falling (LHL) has been observed in 50% of the speakers in
monosyllabic words having consonant /h/ as coda, di-syllabic words having
conjuncts of /ɦ/ in the final syllable and tri / poly-syllabic words having conjuncts of
/ɦ/ in the medial syllable with flap / geminated nasal onset.

• The allotone falling-rising (HLH) has been observed in the following acoustic
environments:

a) Toneme / fricative onset in monosyllabic words having consonant /h/ as coda.
b) Diphthong (long + long vowel) in open final syllable containing conjuncts of /ɦ/.
c) Composite word whose first syllable has consonant /h/ as coda.

• The mono / di / tri / poly-syllabic words having consonant /h/ as onset in the
initial syllable are found to be non-tonal.
Occurrence of consonant /h/ in | Acoustic environment | Tonal variations
Initial syllable | Monosyllabic with consonant /h/ and conjunct containing /ɦ/ as coda | LH (50%), LHL (50%)
  | • Fricative onset | HLH (70%), LHL (30%)
  | • Toneme as onset and /h/ as coda | HL (25%), HLH (75%)
  | • Composite word (initial syllable with /h/ as coda) | HLH (60%), NT (40%)
  | Di-syllabic with consonant /h/ as coda in initial syllable | LH (100%)
Medial syllable | Tri-syllabic with conjunct containing /ɦ/ and short vowel | LH (100%)
  | • Flap / nasal onset | LH (40%), LHL (60%)
  | Tri-syllabic with conjunct containing /ɦ/ and long vowel or diphthong (long + short vowel) | HL (100%)
Final syllable | Di / Tri-syllabic with consonant /h/, conjunct containing /ɦ/ as coda | LH (70%), LHL (30%)
  | • Open syllable with diphthong (long + long) | HLH (100%)


Table 3/39: Independent Tone Rules 
3.4 Summary


The historical origin of tone in Punjabi has been discussed in the Introduction
section. The rules for contextual substitution of Tonemes orthographically by
their voiced / voiceless unaspirated counterparts belonging to the same group of
consonants, and the associated tone marking rules, have also been discussed for
Tones arising from Supra-Laryngeal Consonants. The nature of tones associated
with consonant /h/ and conjuncts of /ɦ/ has been deliberated under the
Independent Tone section, and rules have been summed up including non-tonal
exceptions.

The objective of the experimental work carried out in this chapter was to corroborate
the tone rules of Punjabi as collated through the literature survey. These rules
have been experimentally verified and are applicable by and large. The detailed
analysis presented in the chapter is based on experimental observations,
which led to the discovery of allotones and to findings on tone variations due to
various co-articulatory factors.

The presence of rising and falling tones as discussed in the literature survey is attested,
as seen from the table below. The major research findings are the presence of the
allotones LHL and HLH, for example:

a) LHL (100%) : ear /dudia/ Milky white
b) LH (70%); LHL (30%) : WSz /galdt/ Squirrel
c) LH (50%); LHL (50%) : "J /va/ Wonderful
d) LH (100%), tone shifts to nucleus of prior syllable : f6¥ /nigga/ Warm
e) HLH (100%), diphthong, monosyllabic open vowel : wet /tai/ Two and a Half
f) HLH (100%), diphthong, disyllabic final closed vowel : Y4"S /prodan/ Chief
g) HLH (100%), diphthong, trisyllabic final open vowel : ust /parai/ Education
h) HL (50%); HLH (50%) : fr /tian/ Attention
i) HL (60%); HLH (40%) : Ward /tam d/ Rich Person
Tone on vowel of syllable under consideration | Category of words (syllable under consideration) | Co-articulation parameters in a syllable | Tone variations (percentage of speakers)
LH | Monosyllabic | Consonant /h/ as coda | LH (50%); LHL (50%)
LH | Mono / di / tri / poly-syllabic (initial syllable) | Toneme or conjunct containing conjuncts of /ɦ/ as coda | LH (100%)
LH | Di / tri / poly-syllabic (medial syllable with short vowel as nucleus) | Toneme or conjunct containing /ɦ/ as onset | LH (100%)
LH | Di / tri / poly-syllabic (final open syllable) | Toneme as onset | LH (100%); tone shifts to nucleus of prior syllable
LH | Tri-syllabic (final open syllable) | Diphthong (long + long) | LHL (100%)
LH | Di / tri-syllabic (final closed syllable) | Consonant /h/ or conjunct containing /ɦ/ as coda | LH (70%); LHL (30%)
HL | Monosyllabic (closed syllable) | Toneme as onset, consonant /h/ as coda | HL (100%)
HL | Monosyllabic (closed syllable) | Toneme as onset, any other consonant as coda | HL (50%); HLH (50%)
HL | Monosyllabic (open syllable) | Diphthong | HLH (100%)
HL | Di / tri / poly-syllabic (initial syllable) | Toneme as onset | HL (100%)
HL | Tri / poly-syllabic (medial open syllable and long vowel as nucleus) | Toneme or conjunct containing /ɦ/ in the onset; diphthong (long + short vowel) | HL (100%)
HL | Di-syllabic (final closed syllable) | Toneme as onset | HL (100%)
HL | Di-syllabic (final closed syllable) | Toneme as coda | HL (60%); HLH (40%)
HL | Di-syllabic (final closed syllable) | Diphthong (short + long) and (long + long) and flap / fricative / nasal coda | HLH (100%)
HL | Tri-syllabic (final open syllable) | Consonant /h/ or conjunct containing /ɦ/ as coda with diphthong (long + long) | HLH (100%)


Table 3/40: Tone Marking Rules for Punjabi Language 
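For use in a pronunciation lexicon, rules of this kind can be held as a lookup from acoustic environment to the observed contours. The sketch below is only an illustration under stated assumptions: the key names are hypothetical labels invented here, only four rows of the table are encoded, and choosing the most frequent contour is one possible design choice rather than a procedure given in this chapter.

```python
# Hypothetical keys: (base tone, category of word, co-articulation parameter).
# The percentages are copied from four rows of Table 3/40.
TONE_RULES = {
    ("LH", "monosyllabic", "h_coda"): {"LH": 50, "LHL": 50},
    ("LH", "final_closed_syllable", "h_or_conjunct_coda"): {"LH": 70, "LHL": 30},
    ("HL", "monosyllabic_open", "diphthong"): {"HLH": 100},
    ("HL", "disyllabic_final_closed", "toneme_coda"): {"HL": 60, "HLH": 40},
}


def dominant_contours(rule_key):
    """Return the contour(s) with the highest share of speakers.

    A tie (e.g. 50% / 50%) returns every tied contour, sorted by name,
    so the caller can decide how to record the variation.
    """
    variants = TONE_RULES[rule_key]
    top = max(variants.values())
    return sorted(c for c, pct in variants.items() if pct == top)


print(dominant_contours(("LH", "monosyllabic", "h_coda")))
print(dominant_contours(("HL", "disyllabic_final_closed", "toneme_coda")))
```

Returning all tied contours keeps the allotone variation visible instead of silently picking one.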


The sample data sheets and a few reference graphs are given in Appendix B & C.


Chapter 4 
Experimental Study of Lexical Stress 


4. Introduction 


Stress is a large topic which has been extensively studied for a very long time and
still has many areas of disagreement. However, it is true that in all languages some
syllables are in some sense stronger than other syllables. The difference between
strong and weak syllables is of linguistic importance, and in every language strong
and weak syllables do not occur at random. It is observed that in all languages the
words distinguished by the position of the strong syllable alone are comparatively
few in number. Thus stress alone, without the accompaniment of some other
distinguishing feature, does not constitute a very effective means of differentiating
words. The effort of pronouncing syllables with strong stress is clearly felt by the
speaker, but the resulting prominence is not always easily perceived by hearers
(Jones 1967: 146).


4.1 Syllabification 


How are syllable structures assigned to words? Two main types of principles
determine syllabification in words: universal syllabification principles and
language-specific syllabification principles. In most languages, syllabification in
words follows these generalizations:

i. Each vowel is assigned to a syllable.
E.g. mæ.tɪ.ni: 'matinee'; go:.ɪŋ 'going'

ii. A consonant between two vowels goes with the following syllable.
E.g. mæ.tɪ.ni: 'matinee'; i:.tɪŋ 'eating'

iii. A final consonant goes with the preceding vowel.
E.g. i:.tɪŋ 'eating'

iv. Between two consonants, there is a syllable division.
E.g. lʌn.dən 'London'

v. Languages differ with regard to restrictions on syllable structures.

a) Universal Syllabification Principles
The two most important universal principles influencing syllabification are the Maximal
Onset Principle (MOP) and the Sonority Sequencing Principle (SSP).

The Maximal Onset Principle (MOP) requires that, in the syllabification of a sequence of
consonants, the consonants occur as onsets rather than as codas wherever possible. Thus
a single consonant is syllabified with the following vowel, not with the preceding vowel,
in a majority of languages, for example [mæ.tɪ.ni:] 'matinee', not *[mæt.ɪn.i:].


The Sonority Sequencing Principle (SSP) requires that the sonority of a syllable
decreases from the centre to the edges of the syllable. The sonority scale, from the
most sonorous to the least sonorous, is given below.

Vowels or syllabic consonants - Glides - Liquids - Fricatives - Stops - Geminate stops

The sonority scale could also be given with the least sonorous first and the most
sonorous last.

Consider the sonority in the English words tend, great, swat, etc. In all these words,
the sonority of the consonants increases towards the centre of the syllable. There are
also violations of the SSP. For example, in the word 'pest' the SSP holds at the end,
but it does not hold at the beginning of 'stop': the fricative /s/, which is more
sonorous than /t/, is towards the edge of the word.
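The SSP check described above can be sketched as a comparison of sonority ranks on either side of the nucleus. This is a minimal illustration, not an implementation from this study: the numeric ranks, the small segment inventory and the function names are assumptions, and nasals (which the scale above does not list) are placed between liquids and fricatives as a further assumption so that 'tend' can be scored.

```python
# Assumed numeric ranks: higher means more sonorous.
SONORITY = {"vowel": 6, "glide": 5, "liquid": 4, "nasal": 3, "fricative": 2, "stop": 1}

# Small illustrative inventory in broad ASCII transcription (an assumption).
SEGMENT_CLASS = {
    "a": "vowel", "e": "vowel", "o": "vowel", "i": "vowel", "u": "vowel",
    "w": "glide", "j": "glide",
    "r": "liquid", "l": "liquid",
    "n": "nasal", "m": "nasal",
    "s": "fricative", "f": "fricative",
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop", "g": "stop",
}


def sonority_profile(segments):
    """Map each segment of one syllable to its sonority rank."""
    return [SONORITY[SEGMENT_CLASS[s]] for s in segments]


def obeys_ssp(segments):
    """True if sonority rises strictly to the nucleus and falls strictly after it."""
    profile = sonority_profile(segments)
    peak = profile.index(max(profile))  # position of the nucleus
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling


for word in ("tend", "swat", "pest", "stop"):
    print(word, obeys_ssp(list(word)))
```

Under these assumed ranks, 'tend', 'swat' and 'pest' pass the check, while 'stop' fails because /s/ outranks the following /t/ at the left edge.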

b) Language-specific syllabification constraints
A language-specific constraint on syllabification in English is the following:

There is no syllable division between s+C, C+r/l/w/j and s+C+r/l/w/j.

Following this constraint, the syllabification in the following English words is as
follows:

deprive: dɪ.praɪv   replay: rɪ.pleɪ   equate: ɪ.kweɪt   inspect: ɪn.spekt

The following syllabifications are unacceptable: *dɪp.raɪv, *rɪp.leɪ, *ɪk.weɪt,
*ɪns.pekt.
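The interaction of the Maximal Onset Principle with a language-specific list of permitted onsets can be sketched as follows. The sketch assumes pre-segmented input in a rough ASCII transcription; the vowel set and the short list of legal onset clusters are hypothetical stand-ins chosen only to cover the four English words above, not a full description of English.

```python
# Assumed, deliberately tiny inventories for illustration.
VOWELS = {"i", "e", "a", "ai", "ei"}
LEGAL_ONSETS = {("p", "r"), ("p", "l"), ("k", "w"), ("s", "p"), ("s", "t")}


def syllabify(segments):
    """Split a list of segments into syllables, maximising onsets.

    Between two nuclei, the longest cluster-final stretch that is a legal
    onset goes with the following vowel; a single consonant always does.
    Whatever is left over becomes the coda of the preceding syllable.
    """
    nuclei = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not nuclei:
        return ["".join(segments)]
    boundaries = []
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = segments[prev + 1:nxt]
        start = nxt  # no intervening consonant: next syllable starts at the vowel
        for size in range(len(cluster), 0, -1):  # try the longest onset first
            if size == 1 or tuple(cluster[-size:]) in LEGAL_ONSETS:
                start = nxt - size
                break
        boundaries.append(start)
    starts = [0] + boundaries
    ends = boundaries + [len(segments)]
    return ["".join(segments[s:e]) for s, e in zip(starts, ends)]


print(syllabify(["d", "i", "p", "r", "ai", "v"]))
print(syllabify(["i", "n", "s", "p", "e", "k", "t"]))
```

With the assumed onset list, 'deprive' and 'inspect' come out with the divisions given above rather than the starred alternatives.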

Consonant Clusters in Punjabi: 

Two consonant clusters: 

Initial: 1) p/k/g/t/t + r; 2) s+; 3) p/t/k/k® +); 4) k/g +w 
Final: General 

Medial: General 

Three consonant clusters: 

Initial: Nil 

Final: Nil 


Medial: N+S+rf/[/r 


4.2 Linguistic Theories of Syllabic Structure

4.2.1 Metrical Phonology

The structure of the syllable as proposed in Selkirk (1982) is binary branching:
σ → Onset Rime; Rime → Nucleus Coda

This structural representation allows the syllable to represent quantity by separating
the Onset from the Rime. The Rime carries the weight of a syllable in languages in which
weight plays an important role in word-stress. The structure is exemplified below:

[Tree diagram: syllable structure of the word 'product', with each syllable branching into Onset and Rime, and Rime into Nucleus and Coda]

Fig 4/1: Structure- Metric Phonology
In this representation a binary branching Rime represents a Heavy syllable; a
non-branching Rime represents a Light syllable.

4.2.2 Moraic Theory

The binary branching structure can alternatively be represented in terms of the
mora (μ), which only shows the weight of the syllable (Hyman 1985; Hayes 1989), as
below:

[Diagram: moraic representations of the syllables 'bag' and 'go']

Fig 4/2: Structure- Moraic


The phenomenon of word-stress in many languages, including Punjabi, involves the
weight of the syllable, as we will see below. For Punjabi, as also for Hindi (see Kelkar
1968, Pandey 1989), a three-degree classification is crucial: Light (a short vowel,
V), Heavy (a long vowel (VV) or a short vowel followed by a consonant (VC)) and
Superheavy (a long vowel followed by a consonant (VVC) or a short vowel followed
by two consonants (VCC)).
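The three-degree classification can be sketched as a count of the units in the rime of a CV skeleton. The function name and the skeleton notation ("C" for a consonant, "V" for a short vowel, "VV" for a long vowel) are assumptions made for this illustration; treating any rime longer than three units as Superheavy is likewise an assumption, since the definition above does not cover such cases.

```python
import re

# Onset consonants, then one or two V (short or long vowel), then coda consonants.
_SKELETON = re.compile(r"^C*(V{1,2})(C*)$")


def syllable_weight(skeleton):
    """Classify a syllable skeleton such as 'CV', 'CVV' or 'CVCC' by weight."""
    match = _SKELETON.match(skeleton)
    if match is None:
        raise ValueError(f"not a valid CV skeleton: {skeleton!r}")
    rime_units = len(match.group(1)) + len(match.group(2))
    if rime_units == 1:
        return "Light"        # V
    if rime_units == 2:
        return "Heavy"        # VV or VC
    return "Superheavy"       # VVC, VCC (longer rimes assumed Superheavy)


for skeleton in ("CV", "CVV", "CVC", "CVVC", "CVCC"):
    print(skeleton, syllable_weight(skeleton))
```

The onset is ignored on purpose, in line with the statement in section 4.2.1 that the Rime carries the weight of the syllable.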


Light syllable: nucleus Heavy syllable: nucleus 


Fig 4/3: Weight of Syllables 
The above definition doesn't characterise the consonant clusters occurring as onset /
coda, or diphthongs, which result in three-tier syllables.

For example:

[Diagram: three-tier syllable representation with Segment Layer, Rhyme Layer and CV Layer]

Fig 4/4: Syllable Definition


Based on the syllable definition discussed above and in section 1.5.1, the word
categories in Punjabi, based on the number of syllables in a word as discussed by
Singh (1991), are given below:


Monosyllabic Words: The words containing only one syllable. 


V : 2M /a/, & /e/, & /o/
VC: fF /tkk/, MA /adzd3/, ES /odd/, MZ /or/ 
CV : Ut /pi/, 2 /de/, F/so/ 


CVC: 8 /kor/, US /per/, SS /nal/ 


Di-syllabic Words: The words containing two syllables 


VCV: 2e* /eda/, 8S /ure/, FSF /otd/ 
CVCV: Are /sara/, HET /mota/, StS /kita/ 
VCVC: 83d /vtar/, FEE /odon/, MATE /adzan/ 


CVV Sfmt /lara/, TYE /hora/ 
CVCV: OSH /tfvati/, PT /guatfa/, frst /krari/


CVV: fret /lrai/, Het /kbuai/, HTS /k'vao/ 


Tri-syllabic Words: The words containing three syllables 


VCVCV: al'atl /agari/, GASH /usari/, SS" /tlava/ 
CVCVCV: Het /sovari/, HeU /mutapa/, Ir /kotfodgzd3i/ 


CVCVCVC: Had /sorehan/, SAMS /kurimar/, SSSA /tapebad3/ 


Disyllabic words are the most frequent, and there are loan words borrowed
from other Indo-Aryan languages. The vocabulary is mainly composed of
“tadbhavas”; however, the percentage of “tatsama” words is also on the rise. The
vocabulary logs words in the domain of politics, science and technology. The
morphological forms in Punjabi can’t be directly related to the parts of speech. New
word forms are constructed by using prefixes and suffixes; however, the number of
prefixes is much smaller than that of suffixes. Prefixes are mainly used in the formation
of adjectives. Compound words are also quite frequently used, and there is reduplication
also in their use. Some of the notable features of Punjabi are:

1) Punjabi has an abundance of masculine words.

2) It is known as an /a/-ending language, as most of the nouns and many verbs and
adjectives follow this pattern.

3) Gemination is a special feature of Punjabi among Indo-Aryan languages.

4) It has lexically significant contrastive tone.

5) Nasalization is phonemic.
4.3 Word-Stress
4.3.1 Phonetics of Word-Stress 


The phonetic definition of stress is one of the most difficult topics.

Hayes (1995: 5) says, “The definition of stress is one of the perennially
debated and unsolved problems of phonetics”.

Trask (1996: 336): “Stress is invariably associated with greater loudness, higher pitch
and greater duration, any of which may be more important in a given case, and
sometimes also with vowel quality. Earlier attempts to identify stress with greater
intensity of sound are now discredited, and current thinking holds that stress is
primarily a matter of greater muscular efforts by the speaker and that hearers take
advantage of several types of information to identify that effort”.

Thus, phonetically it may be realized by any one or a combination of the following
features: extra breath force, vowel lengthening, loudness and pitch change. An
example of stress realized as extra breath force is the pronunciation of the word potato
as [pəˈtʰeɪtoʊ]. The stressed syllable with an onset /t/ is aspirated, but the unstressed
syllable with an onset /t/ is unaspirated. Stressed syllables are realized in most
languages with a longer vowel. For example, in the word /a:ka:ʃ/ the stressed
second /a:/ is longer than the unstressed first /a:/. As we will see, vowel lengthening of
the stressed syllable and the complementary feature of vowel shortening in the unstressed
syllable is a prominent feature of stress in Punjabi. Stressed syllables are found to be
louder than unstressed syllables in most languages. Stressed syllables are
perceived to bear a change of pitch from Low to High in Hindi.


The production of stress is generally believed to depend on the speaker using more
muscular energy than is used for unstressed syllables.

Measuring muscular effort is difficult, but it seems possible, according to
experimental studies, that when we produce stressed syllables, the muscles that we
use to expel air from the lungs are more active, producing higher subglottal pressure.
It seems probable that similar things happen with muscles in other parts of our speech
apparatus. Phonetically, stress is also employed to express emphasis. The phonetic
correlate of stress is a combination of length and pitch. Unstressed syllables lack
length and a high pitch.


4.3.2 Phonology of Word-Stress 


The phonological account of word-stress has passed through several stages,
Structuralist (e.g.), Generative Phonology by Chomsky & Halle (1968) and Metrical
Phonology by Selkirk (1980), Hayes (1981) being the most prominent. Metrical
Phonology (refer section 4.2.1) later came to be subsumed by Prosodic Phonology,
Selkirk (1984), Nespor & Vogel (1986). Our main concern here is with the metrical
phonological approach to the study of stress. It would not be out of place to briefly
discuss the theory of Prosodic Phonology.


Prosodic Phonology proposes that the phonological structure consists of the
hierarchical units given below:

Utterance (U)
Intonational Phrase (IP)
Phonological Phrase (φ)
Phonological Word (ω)
Metrical Foot (Σ)
Syllable (σ)

Fig 4/6: Stages of Phonological Stress
An Intonational Phrase has at least one nuclear tone. A Phonological Phrase has at
least one phrasal accent, a Word has at least one stress Foot, and a Foot has at least
one Syllable.

The theory of Prosodic Phonology claims that a unit at a certain level consists of
units at the immediately lower level. Thus a Word consists of at least one Foot, and a
Foot consists of at least one Syllable. A word such as examination
[ɪgˌzæmɪˈneɪʃən] has the following prosodic / metrical structure:

[Diagram: metrical structure of [ɪgˌzæmɪˈneɪʃən], showing its five syllables (σ)]

Fig 4/7: Metrical Structure


The Metrical Tree Theory of word-stress, Hayes (1981), proposes that word-stress in
languages can be represented in terms of relative prominence on a labelled tree
structure. All branches are labelled either strong (s) or weak (w), where strength is the
formalization of stress, as shown below for the words differ and defer:

[Diagram: labelled trees for ˈdiffer (s w) and deˈfer (w s)]

Fig 4/8: Labelled Tree Structure

This not only presents stress as relative prominence, but also explicates secondary
and other levels of stress.
Extrametricality: In the theory of metrical phonology, the notion of ‘extrametricality’
plays a crucial role in the assignment of stress in words.

In languages, a syllable, mora, vowel or consonant may not be counted at the
periphery, i.e. at the beginning or the end of the word. Hayes (1995) discusses the
notion in full. The extrametrical constituent is shown with angled brackets (<>), as
in the representations of the words in Hindi below.

[Diagram: word (Wd) trees for three Hindi words, with syllables marked light (l) or heavy (h) and the extrametrical constituent in angled brackets]

Fig 4/9: Notion of Extrametrical

It is important to note here that in Punjabi, the final foot is extrametrical.


4.4 Parameters of Stress 


Hayes (1981 & 1995) proposes the following parameters along which stress systems 


in world languages vary: 


Quantity-sensitivity: Quantity-sensitive (QS) vs. Quantity-insensitive (QI) 


Boundedness: Bounded vs. unbounded 
Dominance: Left- dominant (LD) vs. Right-dominant (RD) 
Directionality: Left-to-Right (LR) vs. Right-to-Left (RL) 
These parameters are further illustrated by Gussenhoven and Jacobs (2001):

Quantity-sensitivity: It refers to the difference in the quantity of syllables affecting
stress. For example, in Hindi, the words [ˈməhila:] ‘lady’ and [məˈhi:na:] ‘month’
have identical syllable structure, differing only in the middle syllables. The
difference causes the stress pattern to differ in the two words.

Boundedness: It refers to the difference between stress systems allowing a single stress at
the edges in words of any length (unbounded) and those having stress within a fixed
number of syllables (bounded). In Punjabi, for instance, word-stress must be placed
maximally until the ante-penultimate syllable.


Dominance: It refers to languages allowing either the left branch or the right branch 
to be strong. For instance, in Adi, a language spoken in Arunachal Pradesh, the right 
branch has stress, whereas in Punjabi, it is the opposite. This difference can be 
illustrated with the help of the pronunciations of the English word “city” by an Adi 
speaker and a Punjabi speaker: [si'ti:] (Adi English), [‘siti] (Punjabi English). Adi is 
Right dominant (or RD) and Punjabi is Left-dominant (or LD). 


Directionality: It refers to whether stress assignment proceeds from Left-to-Right (LR) or from Right-to-Left (RL). For instance, in Tamil, the first syllable of the word is stressed, so stress assignment starts at the left. In Malayalam, however, stress assignment begins at the right: leaving out the final long vowel, the first long vowel from the right is stressed, e.g. [malo'ja:lom]; if there is no long vowel, then the first vowel is stressed, e.g. ['kamolom] 'lotus'.
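
The Malayalam rule stated above can be sketched as a short procedure. The following Python function is only an illustration under stated assumptions: the function name and the input encoding (one boolean per syllable, True when the vowel is long) are hypothetical and do not come from the cited literature.

```python
def malayalam_stress(long_vowel):
    """Return the index of the stressed syllable.

    long_vowel: list of booleans, one per syllable, True if that
    syllable's vowel is long (a hypothetical input encoding).
    """
    # Scan from right to left, leaving out the final syllable.
    for i in range(len(long_vowel) - 2, -1, -1):
        if long_vowel[i]:
            return i
    # No non-final long vowel: the first vowel is stressed.
    return 0


# A four-syllable word whose third vowel is long, as in the first example.
print(malayalam_stress([False, False, True, False]))
# A three-syllable word with no long vowel, as in the second example.
print(malayalam_stress([False, False, False]))
```

The right-to-left scan is what makes the rule an instance of RL directionality; the fallback to index 0 corresponds to the default initial stress.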


4.5 Rules for Assignment of Stress 


The rules for stress assignment are presented at two levels: at the level of the Foot (containing a single stress) and at the level of the Word (containing one or more stresses).

The following rules apply for English (Hayes 1995):
Foot level: Raise LD, QS, bounded trees from right to left (RL)
Word level: Raise a LD word-tree


The word-tree gives us the primary and secondary stress when a multi-syllabic word contains more than one foot, e.g. 'deva,state (primary stress on the first syllable, secondary stress on the third).


The rules yield the correct output in a majority of English words, subject to 


morphological structure of words. 


4.6 Review of Studies on Stress in World Languages 


Languages in which meaning depends in any degree upon types of stress, or upon the location of strong stress in sequences of syllables, are termed "stress languages". They fall into three categories:

(i) Those in which the location of strong stress in words of more than one syllable is an integral part of the pronunciation of words.

(ii) Those in which the use of special types of stress is an integral part of the pronunciation of words.

(iii) Those in which strong stress is used in sentences but does not have a fixed position in particular words. This is known as intonation and is not discussed here, as it is outside the scope of the defined research problem.


Stress languages of the first category are numerous. Among them are English, German, Russian, Spanish, Danish, Hungarian, Icelandic, Welsh, Greek, etc. In these languages a given word always, or generally, has strong stress on a particular syllable. Some words of more than one syllable may be differentiated by the position of the strongest stress. Stress is the accentuation of syllables within words; this type of stress is known as lexical stress and fits into the second category. The Indo-Aryan languages fall into this category.




Stress functions only to point out the existence, at some point in the utterance, of a 
significant unit carrying the amount of information which is expected from a lexical 
unit. In lexical- stress languages, the syllables of any polysyllabic word are not 
created equal. Some syllables may serve as the focus of accentual prominence; others 
may not. Perceptually, this results in a distinction between the syllables within a 


word. 


According to M. Ohala (1977), "Stress involves morpho-syntactically conditioned intonational difference rather than lexically marked accentual differences". Languages may differ as to the place of stress in a word.

Proto-Indo-European (PIE) is the linguistic reconstruction of the hypothetical 
common ancestor of the Indo-European languages, the most widely spoken language 


family in the world. 


[Figure: family tree of Proto-Indo-European and other languages. Branches shown: Indo-Iranian (Indic; Iranian: Avestan, Old Persian, Middle Persian), Hellenic (Greek), Italic (Latin: Italian, French, Romanian, Spanish), Balto-Slavic (Polish, Russian), Germanic (North Germanic: Swedish, Norwegian; West Germanic: Anglo-Frisian, Old Dutch, Old High German, leading to Modern English, Frisian, Dutch, German and Yiddish) and other languages (Austro-Tai, Austro-Kadai, Austro-Japanese)]


Fig 4/10: Proto-Indo-European Languages Representation

4.6.1 European Languages


The ancient Greeks studied no language but their own; they took it for granted that the structure of their language embodied the universal forms of human thought or, perhaps, of the cosmic order. Accordingly, they made grammatical observations, but confined these to one language and stated them in philosophical form. They discovered the parts of speech of their language, its syntactic constructions, such as, especially, that of subject and predicate, and its chief inflectional categories: genders, numbers, cases, persons, tenses and modes. They defined these not in terms of recognizable linguistic forms, but in abstract terms which were to tell the meaning of the linguistic class. These teachings appear most fully in the grammars of Dionysius Thrax (second century B.C.) and of Apollonius Dyscolus (second century A.D.).


Greek is a language with lexical stress that marks stress orthographically with a special diacritic. Thus, the orthography and the lexicon constitute potential sources of stress assignment information in addition to any possible general default metrical pattern. In Greek spelling, contemporary rules dictate that every word with more than one syllable must bear a stress diacritic on the vowel of its stressed syllable (Petrounias 2002). Greek words with two or more syllables written without a stress diacritic are thus considered misspelled, even though stress assignment can usually be guessed successfully from the phoneme sequence (Protopapas 2006). Extending and complementing previous studies in Italian and Spanish, Greek allows investigation of stress assignment free from the structural (phonological) constraints that interact with default placement in those languages.


Phonological changes concern the segmental and supra-segmental characteristics. Changes in both aspects are linked through production and perception. Expending more muscular energy in the articulatory movements for making a syllable more prominent influences timing, which may possibly result in different vocal tract configurations in stressed versus unstressed vowels. This may cause more or less perceptible changes in time, leading eventually to a different set of syllables depending on position in the word.

When an unstressed syllable is emphasised, the jaw lowers and the short vowels tend to be produced and perceived as more open (short i > e; short u > o; short e > ɛ; short o > ɔ, etc.).

Classical Latin is considered to have had a melodic accent on the penultimate or the ante-penultimate syllable (Roudet 1910). The prosodic anchor point for the pitch accent was the penultimate syllable if it was heavy, i.e. a closed syllable ("syllabe entravée") ending in a consonant (amantem > amant [amɑ̃] 'loving') or a syllable with a long vowel (farina > farine [faʁin] 'flour', amatus > aimé [eme] 'loved'), and the ante-penultimate syllable in the other cases (asinus > asne > âne [ɑn] 'donkey', fragile > frêle [fʁɛl] 'frail'). French descends from Vulgar Latin, i.e. the Latin spoken by the soldiers, merchants and immigrants after the Roman conquest.

Stress can be on the first, penultimate or final syllable, as in Czech, Polish and French respectively. There can also be more complex patterns: in Classical Latin, stress falls on the penultimate syllable if it is long and on the third syllable from the end if the penultimate syllable is short. In French the distribution of stress serves only as a kind of gesture: ordinarily the end of a phrase is louder than the rest; sometimes, in emphatic speech, some other syllable is especially loud; often enough one hears a long succession of syllables with very little fluctuation of stress. In languages such as Italian, Spanish, the Slavic languages etc. the stress characterizes combinations of linguistic forms; the typical case is the use of one high stress on each word in the phrase, with certain unstressed or low-stressed words as exceptions. Thus there are differences in the manner of applying stress among stress-using languages.

In English, the prominence results from the pitch movement and gives the strongest type of stress. The stress in English is either primary /ˈ/ or secondary /ˌ/. Primary stress is stronger than secondary stress and there may be some syllables which are unstressed, e.g. photographic /ˌfəʊtəˈɡræfɪk/. In some cases stress placement in a word distinguishes its function as a noun or a verb, hence it is called functional stress, e.g. in the word delegate:


[ˈdeləɡeɪt] verb; [ˈdeləɡət] noun

In order to decide on stress placement, it is necessary to make use of some or all of the following information:

1. Whether the word morphology is simple, or whether it is complex as a result of 
either containing one or more affixes (that is, prefixes or suffixes) or of being a 
compound word. 

2. The grammatical category to which the word belongs (noun, verb, adjective, etc.). 

3. The number of syllables in the word. 

4. The phonological structure of those syllables. 


4.6.2 Indo-Iranian Languages 


The rapid initial expansion of Islam in the seventh century brought Arabic as the 
sacred language of the Quran to all the vast territories of the Caliphate, but as a 
spoken language only to the Middle East and North Africa. In the eastern lands of 
Iran and Central Asia, Persian continued to be spoken and soon evolved as a literary 
language also. This classical Persian, the most prominent representative of the Iranian 
languages which are quite closely related to Indo-Aryan retained its Indo-European 
structure and basic vocabulary but incorporated a huge number of loan-words from 
Arabic and was written in the Arabic script. Persian language was also known as 


“Farsi” an Arabic adaptation of the word “Parsi”. 


Chodzko (1852) was the first person to discuss stress in Persian. He identified the basic rule that stress is word-final in simple, derived and compound nouns and adjectives, and in nominal verbs. As to verbal stress, he has different rules for different tenses. Mahootian (1997) likewise states that stress is word-final in simple nouns, derived nouns, compound nouns, simple adjectives, derived adjectives, infinitives, and the comparative and superlative forms of adjectives, as well as in nouns with plural suffixes, and mentions verbal stress as one of the exceptions to this rule.

Vahid Sadeghi (2011) discussed the Persian stress pattern by examining its acoustic correlates, i.e. duration and intensity, and concluded that the majority of lexical words in Persian are stressed on the final syllable. The word-final stress pattern applies to nouns, adjectives, most adverbs and simple verbs.


The phonological literature typically describes Arabic stress as predictably falling on a particular location in the word, depending on the internal structure of the syllables making up the word. The pattern of stress location varies considerably in colloquial and modern renditions of classical Arabic (Jong & Zawaydeh 1998). The general pattern of stress placement in Arabic is that the last heavy syllable is typically stressed. Here "heavy" is a term covering closed syllables and open syllables which contain a long vowel. If there are no heavy syllables in a word, then stress falls in some other predictable location.


4.6.3 Lexical Stress due to Gemination in Japanese and Italian 


Gemination of consonants as a distinctive feature occurs in some languages however 
it is subject to various phonological constraints depending on the language. 
Languages such as English and Spanish do not have geminates. Japanese and Italian 


geminates are exemplified by the minimal pairs as given: 


1. Japanese geminate contrast (Tsujimura 2007) 
a.[saka] ‘hill’ 
b. [sakka] ‘author’ 
2. Italian geminate contrast 
a. [fato] ‘fate’ 
b. [fatto] ‘fact’ 


Leben (1980) posited an autosegmental representation of geminates in which a single phoneme is linked to two slots on a skeletal tier that encodes the prosody of the word. This skeletal tier is also referred to as a CV-tier, an X-tier, or a length tier depending on the specific conception of the researcher.

Important earlier works that incorporate a CV-tier include McCarthy (1979 & 1981), Halle and Vergnaud (1980), Clements and Keyser (1983) and Hayes (1986). Geminate representation on this view is exemplified by the geminate [kk] of the Japanese word in (1b).


[Figure: CV-tier representation of [sakka], in which the single segment [k] is linked to two adjacent C slots on the skeletal tier]


Fig 4/11: Skeletal Tier (CV-tier representation): [sakka]


Languages with geminates vary considerably with respect to the durational difference between the geminate and its singleton counterpart. The ratio may vary from 3:1 in Japanese to 1.8:1 in Italian (Idemaru and Guion 2008). Geminate consonants are transcribed by a sequence of two identical letters in orthographic representation. The pronunciation of geminated consonants leads to stress.


4.7 Lexical Stress in Indo-Aryan Languages 


The history of the easternmost branch of the Indo-European language family, known as Indo-Aryan, dates back at least three thousand years to the earliest hymns of the Rigveda, the most ancient of the sacred texts of Hinduism. When the natural processes of linguistic change threatened to corrupt the sacred Vedic texts and thereby sap their ritual power, the world's first linguists emerged from the ranks of the Brahmins to codify and thereby artificially preserve their language. This process reached its culmination in the grammar of Panini (4th c. B.C.), which fixed Old Indo-Aryan in the stage of "Classical Sanskrit".

4.7.1 Hindi-Urdu Languages


Thus Hindi and Urdu can be described as being ultimately descended from Sanskrit, 
near relatives of such contemporary New Indo-Aryan languages as Panjabi or 
Bengali, quite closely related to the next languages of the vast Indo-European family 
(refer Fig 4/10), such as Persian and still more distantly connected to languages such 
as English and Portuguese belonging to remoter branches. Such relationships can be 
objectively demonstrated by reference to shared grammatical structures or to 
etymologically shared vocabulary e.g. Hindi-Urdu /ma/, Sanskrit /mator/, Persian 
/mador/, English /mothar/. 


Husain (1997) discussed that stress falls on the right most heavy syllable in the word. 
If there is no heavy syllable, stress falls on penultimate syllable. Word final segments 


are extrametrical (invisible to the stress rules). 


Halpern (2009) presents five stress rules which govern word-level stress patterns in Modern Standard Arabic. First, stress always falls on the ultimate syllable if that syllable is superheavy, e.g. in the word /dʒa.di:d'/ meaning 'new', the stress falls on the final syllable as it is superheavy. This rule takes precedence over all others. Second, in monosyllabic words, stress always falls on the ultimate syllable. Though this seems obvious, it is necessary to remember that a monosyllabic word carrying a proclitic is still considered monosyllabic, and thus the ultimate syllable must be stressed. This is important because, if the proclitic were counted when applying stress, the rules for disyllabic words would dictate that the stress be penultimate instead of ultimate. For example, the word /bi.kam'/ meaning 'how much' contains the proclitic /bi/, and thus the ultimate syllable is stressed rather than the penultimate. Third, the stress in disyllabic words falls on the penultimate syllable, regardless of syllable weight, should the word lack a superheavy syllable. This pattern can be seen in the word /ka:'.tib/ meaning 'writer', in which the stress falls on the penultimate syllable because the final syllable is not superheavy. Fourth, stress falls on the penultimate syllable in polysyllabic words if that syllable is heavy. The word /ħu.ku:'.ma/ meaning 'government' has stress on the penultimate syllable because it is a heavy syllable. Finally, if the penultimate syllable is light in a polysyllabic word, then the stress falls on the antepenultimate syllable. The verb /ka'.ta.ba/ meaning 'write' demonstrates this pattern, with stress falling on the antepenultimate syllable because the following penultimate syllable is light (Shifflett 2011).

Halpern (2009) mentions a few points which are necessary in understanding how to apply these stress rules. As previously stated, though words with proclitics are technically disyllabic or longer, the proclitic is ignored when applying stress. As a result, disyllabic words with proclitics are treated as monosyllabic, and words of three syllables with proclitics are treated as disyllabic, with regard to the stress pattern rules. Examining how a word is actually pronounced is also important in Modern Standard Arabic (MSA), because in formal situations words are pronounced with case endings. When these case endings are excluded and not pronounced, the stress pattern of the word will change, as will the number of syllables in the word. For example, the word /dʒa.di:d'/ meaning 'new' has final stress as mentioned above. However, with a formal marking of case, the stress moves to the penultimate syllable because a new syllable is added to the end of the original word. The word therefore becomes /dʒa.di:'.dun/ with stress on the heavy penultimate syllable, which follows the rule for polysyllabic words.
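
The five rules summarised above lend themselves to a small decision procedure. The following Python sketch is one possible reading of them, not Halpern's own formulation: the function name, the weight labels "L", "H" and "SH", and the `has_proclitic` flag are assumptions introduced here for illustration, and a superheavy penultimate syllable is assumed to count as heavy for the fourth rule.

```python
def msa_stress(weights, has_proclitic=False):
    """Return the index of the stressed syllable in `weights`.

    weights: one label per syllable, "L" (light), "H" (heavy) or
    "SH" (superheavy); this encoding is an assumption of the sketch.
    has_proclitic: True if the first syllable is a proclitic, which
    is ignored when counting syllables.
    """
    last = len(weights) - 1
    # Rule 1: a superheavy final syllable always takes the stress.
    if weights[last] == "SH":
        return last
    # Proclitics do not count towards the length of the word.
    counted = len(weights) - (1 if has_proclitic else 0)
    # Rule 2: words treated as monosyllabic stress the ultimate.
    if counted <= 1:
        return last
    # Rule 3: words treated as disyllabic stress the penultimate.
    if counted == 2:
        return last - 1
    # Rule 4: polysyllabic words stress a heavy penultimate.
    if weights[last - 1] in ("H", "SH"):
        return last - 1
    # Rule 5: otherwise the antepenultimate is stressed.
    return last - 2


# Syllable weights for the six example words discussed in the text.
print(msa_stress(["L", "SH"]))                    # 'new'
print(msa_stress(["L", "H"], has_proclitic=True)) # 'how much'
print(msa_stress(["H", "H"]))                     # 'writer'
print(msa_stress(["L", "H", "L"]))                # 'government'
print(msa_stress(["L", "L", "L"]))                # 'write'
print(msa_stress(["L", "H", "H"]))                # 'new' with case ending
```

Ordering the checks in this way reflects the statement that the superheavy-final rule takes precedence over all the others.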


4.7.2 Punjabi Language 


Assignment of stress in Punjabi is entirely predictable, yet it patterns differently in disyllabic and trisyllabic words. Optimality Theory provides a unified system in which both disyllabic and trisyllabic words can be handled under a single ranking using typologically attested constraints. Dhillon (2007) presented an Optimality Theoretic analysis of Punjabi stress as well as a brief exploration of Hindi, Sindhi and Urban Hijazi Arabic, three languages with stress systems similar to that of Punjabi. Punjabi exhibits a three-way distinction in syllable weight, with monomoraic light syllables, bimoraic heavy syllables and trimoraic superheavy syllables.

Secondary stress is not found in Punjabi and main stress is not contrastive except for a few minimal word pairs. Stress is also not affected by morphology. In the verb forms, the addition of a suffix to the verb stem does not alter stress placement, nor does the addition of the plural suffix alter stress placement for the nominal forms. Stress in Punjabi is distributed solely according to a pattern based on the syllables present in a word; the same phenomenon is evident in Hindi (Hayes 1995; Pandey 1989; Kelkar 1968) and Sindhi (Walker 1997), two Indo-Aryan languages closely related to Punjabi. Sangha (2014) observed that although Punjabi is not a stress language like English, in many words the change in stress position is lexically significant and may sometimes result in a change of POS category. Stress is a multidimensional suprasegmental feature of Punjabi.


Nara (2016) carried out study of 85 words on stress and tone analysis for Doabi 
dialect of Punjabi using mixed effects model of stress. He also reported only primary 


stress. The stress analysis by him is briefly given here: 


• The syllable with the longest rhyme is stressed, e.g.
  ਕਿਤਾਬ /kɪ'tab/ 'book'
  ਬੁਨਿਆਦ /bʊ'niad/ 'foundation'

• If there is no long rhyme, then the penultimate syllable is stressed, e.g.
  ਉਣੰਜਾ /ʊ'nəndʒa/ 'forty-nine'

• Singleton coda consonants do not contribute to the weight of the rhyme, e.g.
  ਫਿਕਰ /'pʰɪkər/ 'worry'
  ਕਿਰਨ /'kɪrən/ 'ray'

• Homorganic nasal stops and geminate consonants do contribute to the weight of the rhyme:
  ਅਨੰਦ /ə'nənd/ 'happiness'
  ਪਸੰਦ /pə'sənd/ 'preference'
  ਪਵਿੱਤਰਤਾ /pə'vɪttərta/ 'purity'
  ਪ੍ਰਤੱਖ /pər'təkkʰ/ 'obvious'


Pitch starts out low on the stressed syllable and rises through the syllable boundary, so that it is the syllable following the stressed syllable that has the highest pitch. A phonemically long vowel may become a phonemically short vowel when the syllable within which it occurs loses its stress, e.g. ba'taa 'to tell' vs. 'baat 'utterance'. Thus stress falls on the syllable of highest sonority.

Stress placement in Punjabi is determined by syllable structure and morphology 


similar to Hindi - Urdu stress placement. The stress bearing syllable carries high tone. 


4.8 Stress Patterns of Punjabi 


Stress is not a prominent feature of Punjabi, however it is utilized in di-syllabic words 
to distinguish between grammatical categories, known as functional stress. In the 
noun category stress falls on the initial syllable and in the verb category stress falls on 
the final syllable. In gemination, stress falls on the geminated consonant and it 
additionally co-occurs with tone in tonal words. The acoustic characterization of the 
properties can help in identifying the stressed syllable from other unstressed syllables 


in a word. 


4.8.1 Functional Stress 


Stress can be used to establish a distinction in meaning between two words, where the only difference is the placement of stress. Such stress is known as functional stress. English has functional stress. Punjabi also exhibits functional stress in a very small set of words.

Stress placement in these words distinguishes their function as noun, verb, adjective etc., i.e. the POS changes with the change in the position of stress, e.g.


Sr No | Noun                         | Verb
1.    | ਹਰਾ /'həra/ (Green colour)   | ਹਰਾ /hə'ra:/ (Defeat)
2.    | ਭਰਾ /'pəra/ (Brother)        | ਭਰਾ /pə'ra:/ (Filling)
3.    | ਸਫ਼ਾ /'səpʰa/ (Page of book)  | ਸਫ਼ਾ /sə'pʰa:/ (Clean)
4.    | ਗਲਾ /'ɡəla/ (Throat)         | ਗਲਾ /ɡə'la:/ (Cause to melt)
5.    | ਤਲਾ /'təla/ (Sole)           | ਤਲਾ /tə'la:/ (Cause to fry)


Table 4/1: Pairs of Functional Words 


It is noted that there is also an alternation in the vowel depending on the position of stress. The last vowel is lengthened when the second syllable is stressed; it is also referred to as a prolative vowel.


4.8.2 Stress due to Gemination 


Gemination in Punjabi is phonemic. The minimal pairs (non- nasal and nasal) are 


given below: 


Non-Geminate                | Geminate
Word | IPA    | Meaning     | Word | IPA     | Meaning
ਪਤ   | /pət/  | Honour      | ਪੱਤ  | /pətt/  | Leaf
ਸਤ   | /sət/  | Essence     | ਸੱਤ  | /sətt/  | Seven
ਜਿਨ  | /dʒɪn/ | Who         | ਜਿੰਨ | /dʒɪnn/ | Devil


Table 4/2: Minimal Pairs (non- nasal and nasal) 


Orthographically, gemination is represented by double consonants, and such consonants occur in medial and final position only. They are preceded by the short vowels /ə, ɪ, ʊ/.

For example, the geminate clusters are written with the sign ੱ known as /addak/. The


consonantal segments /n /, //, /r /,/ 1 /, /h/ and /j/ do not occur as geminates. 


Geminates of ਫ /pʰ/, ਥ /tʰ/, ਠ /ʈʰ/, ਛ /tʃʰ/ and ਖ /kʰ/ are aspirated only at the final release in a geminated word. They are phonetically similar to a cluster of an unaspirated stop and a homorganic aspirate. There is, however, no structural reason to consider such geminates as different from others (Sangha 2014).


Word IPA 
i /dzappta/ 


/hatthi/ 


Tkotf*i/ 
Fwotftfri/ 
Tkakkta/ 


| Gt) ob) af] a 


Table 4/3: Examples- Geminate Aspirates 
The geminate cluster can lie within a single syllable, e.g. ਕੱਲ /kəll/, or span a syllable boundary, e.g. ਕੱਲੀ /kəlli/, as illustrated below:


[Figure: syllable trees for /kəll/, with both halves of the geminate in one syllable, and /kəlli/, with the geminate split across two syllables]


Phonetically, the duration of a double consonant is 1.5 to 2 times that of the non-geminate consonant, thus increasing the duration of the syllable of which the geminate consonant is a part. Such a syllable becomes stressed relative to the other syllables in a word.

Phonotactics

(i) Diphthongs do not occur in word-initial position.

(ii) Short vowels do not occur in word-final position.

(iii) The short vowels /ɪ/ and /ʊ/ generally occur in word-initial position.

(iv) The vowels /ɛ/ and /ɔ/ generally do not occur in word-final position except in some monosyllabic words.

(v) Words do not begin with the consonants ਙ, ਞ, ਣ, ੜ, ਲ਼ /ŋ, ɲ, ɳ, ɽ, ɭ/.

(vi) Vowel sequences, including sequences of three or more vowels, are abundant at the end of words.

Generalizations about where stress occurs can only be made with reference to syllable types. These types are usually described in terms of syllable weight, categorized as light, heavy or super heavy. The following definitions of light (L), heavy (H) and super-heavy (SH) syllables, evolved by Slata et al (2015), have been followed for the experimental study of intra-syllabic stress.


Light Syllable (L)

(i) Open syllable containing a class I vowel, i.e. V1 or CV1

Heavy Syllable (H)

(i) Open syllable containing a class II vowel or a diphthong, viz. V2, CV2, Va, CVa

(ii) Any syllable having a class I vowel with a coda, or with onset and coda, viz. V1C(C), CV1C

Super Heavy Syllable (SH)

(i) Class II vowel or a diphthong followed by one or more consonants, viz. V2C(C), VaC(C)

(ii) Class II vowel or a diphthong having onset as well as coda, viz. CV2C, (C)CV2C, CVaC, (C)CVaC

(iii) Class I vowel followed by two or more consonants, viz. CV1CC


Table 4/4: Syllable Definitions for Experimental Work 
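
The definitions in Table 4/4 can be read as a small decision procedure. The Python sketch below is one possible encoding of them, offered only as an illustration: the function name and the argument encoding (consonant counts for onset and coda, and "I", "II" or "D" for the vowel class) are assumptions made here. Where the table lists both V1C(C) under Heavy and CV1CC under Super Heavy, the sketch assumes that a class I vowel with two coda consonants is super heavy only when an onset is present.

```python
def syllable_weight(onset, vowel, coda):
    """Classify a syllable as "L", "H" or "SH" following Table 4/4.

    onset, coda: number of consonants before / after the vowel.
    vowel: "I" (class I), "II" (class II) or "D" (diphthong).
    These argument names and encodings are assumptions of this sketch.
    """
    if vowel == "I":
        if coda == 0:
            return "L"   # open syllable with a class I vowel
        if coda >= 2 and onset >= 1:
            return "SH"  # onset + class I vowel + two or more consonants
        return "H"       # class I vowel with a coda
    # Class II vowel or diphthong.
    if coda == 0:
        return "H"       # open syllable with class II vowel or diphthong
    return "SH"          # class II vowel or diphthong with a coda


# Weight pattern of a hypothetical disyllabic word: CV(I) + CV(II)C.
print([syllable_weight(1, "I", 0), syllable_weight(1, "II", 1)])
```

Working with consonant counts rather than transcribed strings keeps the sketch independent of any particular phoneme inventory.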
4.9 Experimental Study
4.9.1 Articulatory Features for Determining Syllabic Stress 


Co-articulation is a phenomenon in which the articulatory movements required for a syllable are often anticipated (anticipatory co-articulation) or carried over (carry-over co-articulation) during the production of an adjacent syllable. Stress plays an important role in this. It depends on: the quality of the syllable peak, the openness or closeness of the syllable, the type of syllable margin, the position of the syllable in the word under consideration, the presence of gemination/tone etc.


Syllable peaks and syllable margins show considerable reduction of quantity, quality, 
and intensity and pitch when occurring in weak position of a syllable whereas there is 
an all around rise in a stronger syllable. Reduction in quality of the initial syllable in 
disyllabic words is a common feature. Sharma (1971) discusses syllabic structure of 
Punjabi in detail with reference to the variations in quality of syllable peaks, vowel 


reduction, schwa deletion etc. as is evident from following examples: 


S. No. | Description | Example word | IPA | Meaning
1. | Preference for class I syllable peaks | ਅੱਗ | /əgg/ | Fire
   |   | ਫੁੱਲ | /pʰʊll/ | Flower
   |   | ਤਿੱਖਾ | /tɪkkʰa/ | Sharp
2. | Reduction / centralization of first syllable to neutral schwa when second syllable is heavy and closed | ਬਾਜ਼ਾਰ > ਬਜ਼ਾਰ | /bazar/ > /bəzar/ | Market
3. | Reduction of class II (phonemically shorter duration of vowel) syllable peak in the final syllable of di/poly-syllabic words | ਕਵੀ | /kəvi'/ | Poet
   |   | ਕਿਰਪਾਲੂ | /kɪrpalu'/ | Kind
4. | Occurrence of class I syllable peak in final open syllable only in mono-syllabic function words | ਕਿ | /kɪ/ | That
   |   | ਕੁ | /kʊ/ | Approximately
5. | Preference for nasalization with long and open syllable peaks | ਮੀਂਹ | /mĩ:/ | Rain
6. | Regressive nasalization of syllable peak | ਕੁੜੀਆਂ | /kʊɽĩã/ | Girls
   |   | ਲਵਾਂ | /ləvã/ | May take
7. | Schwa deletion | ਸੜਕਾਂ | /səɽkã/ | Roads
   |   | ਕਾਗਜ਼ਾਂ | /kagzã/ | Papers


Table 4/5: Variations in Quality of Syllable Peaks & Vowel Reduction in Punjabi 


Crystal (1997) further describes syllables by their position within the word e.g. in a 
tri-syllabic word, final syllable is referred to as the ultimate syllable, while the second 
to the last syllable is the penultimate syllable and the third to final syllable is the 
antepenultimate syllable. All of these placements are determined beginning from the 


rightmost edge of the word, which would be the ultimate syllable. 


Lexical Stress in terms of Intra-syllabic stress needs to be examined to aid the 


prosodic PLS development in Punjabi. 


4.9.2 Empirical Research 


"Empirical" means "based on observation or experience," according to the Merriam- 
Webster Dictionary. Empirical research is a research using empirical evidence. It is a 
way of gaining knowledge by means of direct or indirect observation or experience. 
Empirical evidence can be analyzed quantitatively or qualitatively. 

Empirical analysis is an evidence-based approach to the study and interpretation of information. It is integral to the scientific method and is the usual approach used to study subjects for a probable answer through quantified observations of empirical evidence.

The scientific method begins with scientists forming hypotheses and then acquiring knowledge through observations and experiments to either support or disprove a specific theory. The scientific method often involves lab experiments, and these experiments result in quantitative data in the form of numbers and statistics. The role of empirical study is to develop general hypotheses, which relies upon the capacity to characterize computational models in terms of sets of features that can be used to make and evaluate predictions about what influences the behaviour under investigation (Cohen 1995; Sparck-Jones and Galliers 1996; Walker 1996).


4.9.3 Acoustic Parameters 


The stress measurement parameters discussed in section 4.3.1, viz. Pitch (P), Duration (D) and Intensity (I) of the syllables in a word, form the hypothesis for determining intra-syllabic stress experimentally. As per the literature, the intensity, fundamental frequency (pitch) and duration of vowels are greater within stressed syllables. Though data is still lacking to establish definite correlates, Shifflett (2011) states that fundamental frequency, intensity and duration can be used as phonetic correlate measurements to determine the stress pattern of a language. Thus a systematic approach needs to be taken to measure these parameters in Punjabi word samples. Di-syllabic words have the highest frequency of occurrence in Punjabi. The frequency analysis also reveals the presence of 10-15% tonal words. The empirical study is therefore based on this premise and, to start with, non-tonal words are taken as the basis for determining the inter-relationship of these parameters in the context of stress.


4.10 Methodology 
4.10.1 Data Selection, Recording and Annotation 


The phonologically rich words in various combinations of syllables as per definition 
given in Table: 4/4 are being considered for analysis for 10 speakers (4 Male & 6 


Female). 
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The distribution of data across various catogries of words is given in the table below: 


S. No.   Word category    Total words 
1.       Di-syllabic      185 
2.       Tri-syllabic     86 
3.       Poly-syllabic    12 
         Total            283 


Table 4/6: Data Samples for Study of Lexical Stress 


The data were recorded as per the specifications discussed in section 1.8.2. Data 
annotation was done at the phoneme level using the PRAAT tool, as per the procedure 
discussed in section 3.1.2. The syllables in each word were also marked. A sample 
annotation is depicted below: 


Fig 4/12: Sample Annotation (syllable & phoneme level) 

4.10.2 Recording of Data Sheets 


The syllable-level annotated data of the above samples was used for recording the various 
acoustic parameters discussed in section 4.9.3. A sample data sheet is given below: 


Word: /sorab/ 
(D = duration, P = pitch, I = intensity) 

Speakers     Syllable 1                Syllable 2 
             D      P        I         D      P        I 
M1 (S8)      0.13   135      71        0.18   165      70 
M2 (S13)     0.2    200      71        0.21   192      66 
M3 (S5)      0.29   188      63        0.22   186      65 
M4 (S7)      0.2    171      65        0.16   176      67 
F1 (S1)      0.27   241      67        0.25   205      64 
F2 (S2)      0.23   241      63        0.24   248      62 
F3 (S3)      0.4    239      63        0.22   227      65 
F4 (S11)     0.27   305      70        0.31   357      71 
F5 (S6)      0.26   241      62        0.23   248      62 
F6 (S9)      0.32   246      72        0.25   261      72 
Average      0.26   220.70   66.70     0.23   226.50   66.40 


Table 4/7: Sample of Syllable Level Data of different Acoustic Features 
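
The "Average" row of the data sheet can be reproduced with a short Python sketch. This is only an illustration of the averaging step; the names `readings` and `column_means` are hypothetical, and the syllable-1 intensity for speaker M1 is assumed to be 71, the value implied by the reported column average of 66.70.

```python
# Syllable-level readings for the sample word in Table 4/7.
# Each row: (D1, P1, I1, D2, P2, I2), i.e. duration, pitch and intensity
# for syllable 1 followed by syllable 2.
# Assumption: the M1 syllable-1 intensity is taken as 71, the value implied
# by the reported column average of 66.70.
readings = {
    "M1": (0.13, 135, 71, 0.18, 165, 70),
    "M2": (0.20, 200, 71, 0.21, 192, 66),
    "M3": (0.29, 188, 63, 0.22, 186, 65),
    "M4": (0.20, 171, 65, 0.16, 176, 67),
    "F1": (0.27, 241, 67, 0.25, 205, 64),
    "F2": (0.23, 241, 63, 0.24, 248, 62),
    "F3": (0.40, 239, 63, 0.22, 227, 65),
    "F4": (0.27, 305, 70, 0.31, 357, 71),
    "F5": (0.26, 241, 62, 0.23, 248, 62),
    "F6": (0.32, 246, 72, 0.25, 261, 72),
}


def column_means(rows):
    """Return the mean of each column across all speakers."""
    columns = list(zip(*rows))
    return [sum(column) / len(column) for column in columns]


averages = column_means(list(readings.values()))
rounded_averages = [round(value, 2) for value in averages]
print(rounded_averages)
```

Rounded to two decimals, the computed means agree with the averages reported in the data sheet.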


4.10.3 Linear Regression Analysis 


Linear regression is a linear approach to modelling the relationship between a scalar 
response (or dependent variable) and one or more explanatory variables (or 
independent variables). It is a basic and commonly used type of predictive analysis. 

The regression estimates are used to explain the relationship between one 
dependent variable and one or more independent variables. Linear regression is very 
extensible and can be used to capture non-linear effects. There are typically a small 
number of coefficients. If only a small number of features are important, it 
predicts future data quite well in many cases, despite its simplicity. 


The standard deviation (represented by the Greek letter sigma, σ) is a measure that is 
used to quantify the amount of variation or dispersion of a set of data values. A low 
standard deviation indicates that the data points tend to be close to the mean of the 
set, while a high standard deviation indicates that the data points are spread out over a 
wider range of values. In statistics, a confidence interval (CI) is a type of interval 
estimate, computed from the statistics of the observed data. The confidence 
level represents the frequency (i.e. the proportion) of possible confidence intervals 
that contain the true value of the unknown population parameter. In other words, if 
confidence intervals are constructed using a given confidence level from an infinite 
number of independent sample statistics, the proportion of those intervals that contain 
the true value of the parameter will be equal to the confidence level. The 95% 
confidence level is most commonly used; however, other confidence levels in the 
range of 90% to 99% can also be used. 


In two-dimensional linear regression, the general form for a model is a distribution 
concentrated along a line. A line is determined by two parameters, its slope and its 
y-intercept, and we want to find the parameters that determine the best-fit line for a 
given set of points. We know that the data points probably won't all fall right on any 
one line, so there will always be some error. For any given line, we can define a 
distribution that is equal to one along the line and decreases as we move away from 
the line. In particular, the probability will be defined by the Gaussian function 
$e^{-d^{2}/2\sigma^{2}}$, where d is the distance from the line, so that as we move 
away from the line, the probability will follow a bell curve. 

Fig 4/13: Graph of the Distribution Function 
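
The best-fit line and the Gaussian fall-off described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study (the fits reported here were obtained with a curve-fitting tool); ordinary least squares is assumed as the fitting criterion, and the function names `fit_line` and `gaussian_weight` as well as the four sample points are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares estimate of slope and y-intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gaussian_weight(distance, sigma):
    """Gaussian fall-off exp(-d^2 / (2 sigma^2)); equals 1 on the line."""
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


# Hypothetical sample points, used only to exercise the functions.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
print(gaussian_weight(0.0, 1.0), gaussian_weight(1.0, 1.0))
```

A point lying on the line has weight 1, and the weight decays along a bell curve as the distance grows, as described in the text.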


The properties of a normal distribution are: 
• The mean is at the middle (50% of the data are above it and 50% of 
the data are below it) 
• 68% of the data fall between -1 and +1 standard deviations of the mean 
• 95% of the data fall between -2 and +2 standard deviations 
• 99.7% of the data fall between -3 and +3 standard deviations 
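
The percentages listed above follow from the cumulative distribution function of the normal distribution. As a minimal check, assuming only the Python standard library, the fraction of data within k standard deviations of the mean is erf(k/√2); the function name `fraction_within` is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
```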


The acoustic parameters discussed in section 4.9.3 are modelled using 
linear regression for relational analysis. Linear regression is a standard 
mathematical technique, used here to predict the intra-syllabic 
stress in percentage, i.e. the heaviest syllable is identified for marking the 
stress. The duration, pitch and intensity of both the syllables in a word, 
averaged across 10 speakers for 95 words, were tabulated in a spreadsheet and 
plotted using CurveExpert Professional, a cross-platform solution for 
curve fitting. 

A sample spreadsheet and graph plot for the duration of the first syllable of all 
sample words is given below. Similar plots were made for pitch and intensity 
for both the syllables in each word. 


Fig 4/14: Plot for Duration 


Fig 4/15: Piecewise Linear Curve Fitting 


The process of finding the best-fit piecewise linear curve was automated using this 
tool. Linear regression attempts to model the relationship between two variables by 
fitting a linear equation to observed data. A linear regression line has an equation of 
the form y = ax + b, where x is the explanatory variable and y is the dependent 
variable. Using the nonlinear model, piecewise linear curve fitting equations of the form 
ax + b | cx + d (ax + b < 50; cx + d >= 50) were obtained from this tool for all the three 
parameters for both the syllables. 


Parameter        Range                        Syllable 1        Syllable 2 
Duration (D)     n < 50                       2.04d + 7.48      2.73d + 2.51 
                 n > 50                       3.50d + 3.84      3.74d + 1.72 
                 Linear Curve Equation (1)    5.54d + 11.32     6.47d + 4.23 
Pitch (P)        n < 50                       6.21p + 2         4.95p + 2.20 
                 n > 50                       1.32p + 1.48      5.91p + 2.12 
                 Linear Curve Equation (2)    7.53p + 3.48      10.86p + 4.32 
Intensity (I)    n < 50                       6.61i + 6.58      8.06i + 6.43 
                 n > 50                       8.27i + 6.51      4.88i + 6.64 
                 Linear Curve Equation (3)    14.87i + 13.09    12.94i + 13.07 


Table 4/8: Piecewise Linear Curve Fitting Equations of Acoustic Parameters 

Averaging over the two syllables, the linear equations of all the three acoustic parameters 
which influence the lexical stress are: 

f(d) = 6.00d + 7.77 

f(p) = 9.19p + 3.90 

f(i) = 13.90i + 13.08 
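
The averaging step can be traced with a short Python sketch using the linear curve equations (1) to (3) of Table 4/8 for each syllable. The dictionary layout and the helper name `average_fit` are hypothetical; each fit is stored as a (slope, intercept) pair.

```python
# Linear curve equations from Table 4/8, stored as (slope, intercept)
# for syllable 1 and syllable 2 respectively.
syllable_fits = {
    "duration": ((5.54, 11.32), (6.47, 4.23)),
    "pitch": ((7.53, 3.48), (10.86, 4.32)),
    "intensity": ((14.87, 13.09), (12.94, 13.07)),
}


def average_fit(fit_1, fit_2):
    """Average two (slope, intercept) pairs term by term."""
    return tuple((a + b) / 2 for a, b in zip(fit_1, fit_2))


averaged = {name: average_fit(*fits) for name, fits in syllable_fits.items()}
for name, (slope, intercept) in averaged.items():
    print(f"{name}: slope {slope:.3f}, intercept {intercept:.3f}")
```

The term-by-term averages are 6.005 and 7.775 for duration, 9.195 and 3.900 for pitch, and 13.905 and 13.080 for intensity, which correspond to the coefficients of f(d), f(p) and f(i) as reported to two decimals.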


The normal distribution curve for all the three acoustic parameters is as below, using 
the standard deviation and mean of the data averaged over two syllables. 


Analyzing the above functions derived from the stress patterns of the recorded 
samples, the corresponding weightage factors of the acoustic parameters have been 
calculated. Thus the empirical stress function ($s_t$) can be defined as 

$s_t = 0.49d + 0.16p + 0.35i$, where 
d is the duration (ms) 
p is the pitch measured in terms of frequency (Hz) 
i is the intensity (dB) 


This reveals that duration and intensity have higher importance in determining lexical 
stress as compared to pitch. 

The syllabic weight of every syllable is calculated using the above stress function, 
and the heaviest syllable is identified as the strongest syllable in a word. 
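
As a minimal sketch of this step, the empirical stress function can be applied to each syllable and the heaviest syllable selected. The function names `stress_weight` and `heaviest_syllable` are hypothetical, and the input values are the speaker averages of Table 4/7, used exactly as recorded in that data sheet (an assumption about units, made only for illustration).

```python
def stress_weight(duration, pitch, intensity):
    """Empirical stress function s_t = 0.49d + 0.16p + 0.35i."""
    return 0.49 * duration + 0.16 * pitch + 0.35 * intensity


def heaviest_syllable(syllables):
    """Return the 1-based index of the syllable with the largest weight."""
    weights = [stress_weight(*syllable) for syllable in syllables]
    return weights.index(max(weights)) + 1, weights


# Speaker-averaged (D, P, I) values of the sample word in Table 4/7,
# taken as recorded there (assumption: no unit conversion is applied).
sample_word = [
    (0.26, 220.70, 66.70),  # syllable 1
    (0.23, 226.50, 66.40),  # syllable 2
]

position, weights = heaviest_syllable(sample_word)
print(position, [round(weight, 2) for weight in weights])
```

For this sample word the second syllable carries the larger weight, so under these assumptions the stress would be marked on syllable 2.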


4.11 Data Analysis 


Statistics and probability are interrelated but separate academic disciplines. Statistical 
analysis often uses probability distributions. 

When a frequency distribution is normally distributed, we can find the probability 
of a score occurring by standardizing the scores; the standardized values are known as 
standard scores (or z-scores). 

Standardization converts the data in our frequency distribution to the standard normal 
distribution, which has a mean of 0 and a standard deviation of 1. Z-scores are 
expressed in terms of standard deviations from the mean. The absolute value of z 
represents the distance between the raw score and the mean in units of the standard 
deviation. z is negative when the raw score is below the mean and positive when it is 
above. Thus z-scores are a way to compare results. 


Fig 4/16: Normal Distribution of z-score 


The formula for calculating the standard score is given below: 

$z = (x - \mu) / \sigma$ 
where: 
x is the raw score 
$\mu$ is the mean of the data 
$\sigma$ is the standard deviation 

Applying this definition to the stress data: 

$z = (s_t - \bar{s}_t) / \sigma$ 
where 
$s_t$ is the reference value, taken as 0 to get the point to the left of which the 
intra-syllabic stress is negative 
$\bar{s}_t$ is the mean of the intra-syllabic stress (in %) 
$\sigma$ is the standard deviation 

Through the cumulative distribution function, the z-score gives the percentage of scores 
that are lower than the reference value. 

A standard normal table (Z Score Table) gives the values of the cumulative 
distribution function ($\Phi$) of the normal distribution. Using the value of the z-score, a 
value is obtained from the online Z Score Table. This value gives the probability 
of a score lower than the defined reference value, which gives the lexical stress pattern 
for the given range of the data. 


A heuristic is an approach to problem solving, learning or discovery that employs a 
practical method, not guaranteed to be optimal, perfect, logical or rational, but 
sufficient for reaching an immediate goal. Thus, taking a cue from the 80-20 rule, 
the stress rules are evolved based on a minimum 80% probability of occurrence of 
that rule in the data being analysed, using the heuristic approach defined above. 


4.11.1 Di-syllabic Words 


Di-syllabic words have the highest frequency of occurrence in Punjabi; hence lexical 
stress is experimentally examined for these words first and the findings are 
extrapolated. The frequency of tonal words is only 10-15%; therefore the basis is 
evolved first for di-syllabic non-tonal words and is then validated for tonal words. 

a) Di-syllabic Non-Tonal Words 

Ninety-five phonetically annotated words are analysed. The stress of syllable 1 
and syllable 2 of each word was calculated using the empirical stress function ($s_t$). 
The intra-syllabic stress was calculated using the formula 
$s_{ti} = ((s_{t2} - s_{t1}) / s_{t1}) \times 100$, and $s_{ti}$ is tabulated in 
Annexure I in Appendix D, where i varies from 1 to 95. 

Mean of $s_{ti}$ for 95 words ($\bar{s}_t$) = 4.50 


Standard Deviation ($\sigma$) = $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (s_{ti} - \bar{s}_t)^{h}}$ ... eq (1) 

where n = 95; h = 2 
Using this formula, $\sigma$ = 3.11 
The range is $-3.65 < s_t < 11.51$ 


The normal distribution curve is: 


Fig 4/17: Graph of the Distribution of Lexical Stress Data of Non-Tonal Di-syllabic 
Words 


It is observed that the majority of the data is positive, which reflects that the stress lies 
on the second syllable. The probability (P) of words having intra-syllabic stress less than 0 
can be calculated using the z-score. 

z = (0 - 4.50) / 3.11 = -1.44 

The corresponding value of $\Phi$ for this z-score is given below: 

P(x < 0) = P(z < -1.44) = 0.075, i.e. 7.5% (marked red in the figure) 

Therefore there is a 92.5% probability of lexical stress being present on the second syllable 
(ultimate syllable). 

S. No.   Word category                  Rule                         Probability of occurrence   Exception 
1.       Di-syllabic non-tonal words    Stress on syllable 2 (R1)    0.93                        - 


Table 4/9: Lexical Stress Rules for Di-syllabic Non-Tonal Words 


Rule 1: Stress falls on ultimate syllable 
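
The z-score calculation for the non-tonal di-syllabic words can be cross-checked numerically instead of reading the value from a Z Score Table. The sketch below assumes only the Python standard library; the function name `normal_cdf` is hypothetical, and the mean and standard deviation are the values reported above (4.50 and 3.11).

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


mean_stress = 4.50      # mean intra-syllabic stress (%), as reported
std_deviation = 3.11    # standard deviation, as reported
reference = 0.0         # stress below this value is negative

z = (reference - mean_stress) / std_deviation
p_below = normal_cdf(z)          # probability of negative intra-syllabic stress
p_second_syllable = 1.0 - p_below

print(round(z, 3), round(p_below, 3), round(p_second_syllable, 3))
```

Computed without intermediate rounding, the probability of negative intra-syllabic stress comes to roughly 7.4%, in close agreement with the table-based value of 7.5%, so the conclusion that stress falls on the second syllable in about 93% of cases is unchanged.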


b) Di-syllabic Toneme Words 

Sixty-six phonetically annotated words have been analysed, of which 33 words 
contain the toneme in the initial syllable and the remaining 33 have the toneme in the 
final syllable. The stress of syllable 1 and syllable 2 of each word was calculated using 
the empirical stress function ($s_t$). The intra-syllabic stress was calculated and 
$s_{ti}$ is tabulated in Annexure II in Appendix D, where i varies from 1 to 66. 


Mean of $s_{ti}$ for 66 words ($\bar{s}_t$) = 0.99 

Using eq (1), Standard Deviation ($\sigma$) = 7.86 

where n = 66; h = 2 

The range is $-14.36 < s_t < 20.45$ 

The normal distribution curve is: 

z = (0 - 0.99) / 7.86 = -0.12 

P(x < 0) = P(z < -0.12) = 0.45, i.e. 45% 

Fig 4/18: Graph of the Distribution of Lexical Stress Data of Di-syllabic Toneme 
Words 


Rule 1 (R1) is largely applicable to tonal di-syllabic words also; however, it is 
noted from the above figure that words towards the left side of the mean do not carry 
any lexical stress, as these contain the toneme in the initial syllable, due to which 
syllable 1 also becomes emphasised and counterbalances the stress which is generally 
observed on syllable 2 in such words. 

Rule 2: No Stress 


Hence the rule table can be represented as: 

S. No.   Word category               Position of toneme   Rule 
1.       Di-syllabic Toneme words    Final syllable       Stress on syllable 2 (R1) 
                                     Initial syllable     No Stress (R2) 

Table 4/10: Lexical Stress Rules for Di-syllabic Toneme Words 

c) Di-syllabic Words (consisting of consonant /h/ or conjuncts of /ɦ/) 


Twenty-four words containing consonant /h/ and conjuncts of /ɦ/ were examined. The 
stress of syllable 1 and syllable 2 of each word was calculated using the empirical 
stress function ($s_t$). The intra-syllabic stress was calculated and $s_{ti}$ is tabulated 
in Annexure III in Appendix D, where i varies from 1 to 24. 

Mean of $s_{ti}$ for 24 words ($\bar{s}_t$) = 4.82 

Using eq (1), Standard Deviation ($\sigma$) = 5.40 
where n = 24; h = 2 

The range is $-3.07 < s_t < 14.8$ 


The normal distribution curve is: 


Fig 4/19: Graph of the Distribution of Lexical Stress Data of Di-syllabic Words 
(consisting of consonant /h/ or conjuncts of /ɦ/) 

It is observed that the majority of the data is positive, which reflects that the stress lies 
on the second syllable. The percentage of words having intra-syllabic stress less than 0 
can be calculated using the z-score. 

z = (0 - 4.82) / 5.40 = -0.892 
P(x < 0) = P(z < -0.892) = 0.186, i.e. 18.6% (marked red in the figure) 

Therefore there is an 81% probability of lexical stress being present on the second syllable. 

Rule 1 (R1) is largely applicable to this category of words; however, it is noted 
from the above figure that three words marked in the red region do not carry any 
lexical stress. Hence the rule table can be represented as: 


S. No.   Word category               Rule                    Probability of   Exception 
                                                             occurrence 
1.       Di-syllabic Words           Stress on syllable 2    0.81             - 
         (consisting of consonant    (R1) 
         /h/ or conjuncts of /ɦ/) 

Table 4/11: Lexical Stress Rules for Di-syllabic Words (consisting of 
consonant /h/ or conjuncts of /ɦ/) 


4.11.2 Tri-syllabic Words 


In tri-syllabic words, the stress may fall on the final syllable or the penultimate syllable 
in the case of European languages, as discussed in section 4.6.1. This needs to be examined 
in the context of Punjabi. The experimental data of the second and third syllables are 
compared and the stress is reported along with exceptions, if any. 


a) Tri-syllabic Non-Tonal Words 


Thirty phonetically annotated words have been analysed, of which three words 
contain the toneme in the initial syllable. The stress of all three syllables of each word 
was calculated using the empirical stress function ($s_t$). The intra-syllabic stress was 
calculated using the formula $s_{ti} = ((s_{t3} - s_{t2}) / s_{t2}) \times 100$, and 
$s_{ti}$ is tabulated in Annexure IV in Appendix D, where i varies from 1 to 30. 

Mean of $s_{ti}$ for 30 words ($\bar{s}_t$) = 4.5 

Using eq (1), Standard Deviation ($\sigma$) = 3.25 

where n = 30; h = 2 
The range is $-1.48 < s_t < 10.85$ 

The normal distribution curve is: 


Fig 4/20: Graph of the Distribution of Lexical Stress Data of Tri-syllabic Non-Tonal 
Words 


It is observed that the majority of the data is positive, which reflects that the stress lies 
on the third syllable. The percentage of words having intra-syllabic stress less than 0 can 
be calculated using the z-score. 

z = (0 - 4.5) / 3.25 = -1.38 

P(x < 0) = P(z < -1.38) = 0.083, i.e. 8% (marked red in the figure) 

Therefore there is a 92% probability of lexical stress being present on the third syllable. 

Rule 1 (R1) is largely applicable to this category of words and stress falls on the 
third syllable. However, it is observed from the data that syllable 1 is stronger when 
the toneme occurs in the initial syllable; hence the stress gets counterbalanced and 
there is no need to mark stress. 

Hence the rule table can be represented as: 

S. No.   Word category                   Rule                         Probability of occurrence   Exception 
1.       Tri-syllabic non-tonal words    Stress on syllable 3 (R1)    0.92                        - 


Table 4/12: Lexical Stress Rules for Tri-syllabic Non-Tonal Words 


b)  Tri-syllabic Toneme Words 


Twenty-eight phonetically annotated words have been analysed, of which 7 words 
contain the toneme in the initial syllable. The stress of all three syllables of each word 
was calculated using the empirical stress function ($s_t$). The intra-syllabic stress of the 
28 words was calculated using the formula $s_{ti} = ((s_{t3} - s_{t2}) / s_{t2}) \times 100$, 
and $s_{ti}$ is tabulated in Annexure V in Appendix D, where i varies from 1 to 28. 

Mean of $s_{ti}$ for 28 words ($\bar{s}_t$) = 4.85 

Using eq (1), Standard Deviation ($\sigma$) = 5.92 

where n = 28; h = 2 

The range is $-4.54 < s_t < 16.49$ 


The normal distribution curve is: 


Fig 4/21: Graph of the Distribution of Lexical Stress Data of Tri-syllabic Toneme 
Words 

It is observed that the majority of the data is positive, which reflects that the stress lies 
on the third syllable. The percentage of words having intra-syllabic stress less than 0 can 
be calculated using the z-score. 

z = (0 - 4.85) / 5.92 = -0.81 

P(x < 0) = P(z < -0.81) = 0.20, i.e. 20% (marked red in the figure) 

Therefore there is an 80% probability of lexical stress being present on the third syllable. 
It is observed that seven words (marked *) containing the toneme as onset in the initial 
syllable have $s_{t1}$ as the heaviest syllable, and thus stress falls on syllable 1 in these 
cases, for example: 


sHat /pasuti/ (refer Annexure V in Appendix D) 


Rule 1 (R1) is largely applicable to this category of words and stress falls on the 
third syllable; however, it is noted from the data that syllable 1 ($s_{t1}$) is stronger 
if the toneme is in the initial syllable. In this case the stress which generally falls on 
the ultimate syllable gets counterbalanced. Hence the rule table can be represented as: 

S. No.   Word category                Position of toneme         Rule 
1.       Tri-syllabic Toneme words    Medial / Final syllable    Stress on syllable 3 (R1) 
                                      Initial syllable           No Stress (R2) 


Table 4/13: Lexical Stress Rules for Tri-syllabic Toneme Words 


c) Tri-syllabic Words (consisting of consonant /h/ or conjuncts of /ɦ/) 

Twenty-eight words containing consonant /h/ and conjuncts of /h/ were examined. 
The stress of syllable 1, syllable 2 and syllable 3 of each word was calculated using the 
empirical stress function ($s_t$). 

The intra-syllabic stress was calculated using the formula 
$s_{ti} = ((s_{t3} - s_{t2}) / s_{t2}) \times 100$, and $s_{ti}$ is tabulated in 
Annexure VI in Appendix D, where i varies from 1 to 28. 

Mean of $s_{ti}$ for 28 words ($\bar{s}_t$) = 3.82 

Using eq (1), Standard Deviation ($\sigma$) = 4.5 
where n = 28; h = 2 
The range is $-8.03 < s_t < 17.6$ 


The normal distribution curve is: 


Fig 4/22: Graph of the Distribution of Lexical Stress Data of Tri-syllabic Words 
(consisting of consonant /h/ or conjuncts of /ɦ/) 


It is observed that the majority of the data is positive, which reflects that the stress lies 
on the third syllable. The percentage of words having intra-syllabic stress less than 0 can 
be calculated using the z-score. 

z = (0 - 3.82) / 4.5 = -0.85 
P(x < 0) = P(z < -0.85) = 0.19, i.e. 19% (marked red in the figure) 

Therefore there is an 81% probability of lexical stress being present on the third syllable. 


Rule 1 (R1) is largely applicable to this category of words; however, it is noted 
from the above figure that three words marked in the red region do not carry any 
lexical stress. Hence the rule table can be represented as: 

S. No.   Word category               Rule                    Probability of   Exception 
                                                             occurrence 
1.       Tri-syllabic Words          Stress on syllable 3    0.81             - 
         (consisting of consonant    (R1) 
         /h/ or conjuncts of /ɦ/) 

Table 4/14: Lexical Stress Rules for Tri-syllabic Words (consisting of 
consonant /h/ or conjuncts of /ɦ/) 


4.11.3 Poly-syllabic Words 


Twelve poly-syllabic (8 quadri-syllabic & 4 penta-syllabic) annotated words have 
been analysed. The syllabic weights of all the syllables of each word were calculated 
and have been tabulated in Annexure VII in Appendix D. 


Word category    Rule                                   Exception 
Poly-syllabic    Stress on ultimate syllable (R1)       Noted in 50% of the words 
                 Stress on penultimate syllable (R3)    Noted in 50% of the words 
                 No Stress (R2)                         Toneme in initial syllable 

Table 4/15: Lexical Stress Rules for Poly-syllabic Words 

4.12 Findings and Discussion 


S. No.   Word category                  Condition                     Rule 
1.       Di/Tri-syllabic non-tonal      -                             Stress on ultimate syllable 
         words 
2.       Di/Tri-syllabic                Toneme in medial or final     Stress on ultimate syllable 
         Supra-Laryngeal tonal words    syllable 
                                        Toneme in initial syllable    No Stress 
3.       Di/Tri-syllabic words          -                             Stress on ultimate syllable 
         (consisting of consonant 
         /h/ or conjuncts of /ɦ/) 
4.       Poly-syllabic words            Noted in 50% of the words     Stress on ultimate syllable 
                                        Noted in 50% of the words     Stress on penultimate syllable 
                                        Toneme in initial syllable    No Stress 


Table 4/16: Lexical Stress Marking Rules 


The difference between strong and weak syllables is of linguistic importance, as 
discussed in section 4. In this context, a few examples of words having functional stress, 
which is phonemic, have been discussed in section 4.8.1. The experimental work 
carried out above has focused on identifying the strongest syllable in a word so that 
stress can be marked on this syllable, based on which prosody modelling can be done for 
text-to-speech system development. 

Chapter 5 
Acoustic Variability of Schwa 


5. Introduction 


Speech science is the study of all the factors involved in producing, transmitting, 
perceiving and comprehending speech, including all relevant aspects of anatomy, 
physiology, neurology and acoustics, as well as phonetics. Speech analysis began in 
1940 in the United States of America. The study of speech production from an 
acoustical point of view provides the means for looking at a very complex process 
in a simple way. The source of sound with which we are most concerned is the 
human voice. Here fluctuations in air pressure are caused by a variety of means. The 
most important of these is the rapid opening and closing of the vocal cords. Each 
time the vocal folds are closed pressure is built up, which is suddenly released when 
they are opened. Consequently the rapid opening and closing of the folds causes a 
series of sharp variations in air pressure. The air in the vocal tract will vibrate in 
different ways when the vocal organs are in different positions. 


Speech sounds in a language are generally classified in two broad categories, viz. 
segmental and supra-segmental. Segmental sounds are further divided into vowels and 
consonants. Supra-segmental sounds are classified into stress, tone, nasalization etc. 
Vowels can be defined in terms of both phonetics and phonology. Phonetically, they 
are sounds articulated without a complete closure in the mouth or a degree of 
narrowing which would produce audible friction; the air escapes evenly over the 
centre of the tongue. If air escapes solely through the mouth, the vowels are said to 
be oral; if some air is simultaneously released through the nose, the vowels are 
nasal. It is very difficult to classify the vowels, and this classification is usually 
carried out using acoustic or auditory criteria, supplemented by details of lip 
position. There are several systems for representing vowel position visually. From a 
phonological point of view, vowels are those units which function at the centre of 
syllables. 

In some approaches, the term 'vowel' is reserved for the phonological level of 
analysis; vocoid is then used for the phonetic vowel, which is generally also called a 
semi-vowel. 




Fig 5/1: Tongue Positions in Production of Vowels 


In the production of vowels, the air stream coming from the lungs passes through the 
oral cavity without any obstruction. While producing vowels, different parts of the 
tongue move to different heights within the oral cavity and the shape of the lips is 
modified. In the production of vowels, the vocal cords may vibrate to produce voiced 
vowels. The nasal passage remains closed when the non-nasal oral vowels are 
produced, and it remains open, allowing the air stream to pass through the nasal 
cavity, when nasalized vowels are produced. Point 1 in the above figure indicates the 
height to which the front of the tongue can be raised in the production of a vowel 
sound. Points 1 & 4 represent the front close and the back close positions 
respectively. Point 3 represents the back open unrounded vowel, i.e. /ɑ/. Point 2 
represents the front unrounded vowel between the half-open and open positions, i.e. 
the vowel /æ/. 


Vowel systems vary greatly in their complexity from language to language. English 
happens to be relatively rich in vowel contrasts, with the added complexity that the 
vowel system is by no means uniform across the English-speaking world. Lindblom 
(1986) provides a brief but useful survey of 'some facts' about vowel systems as 
well as a discussion of how languages exploit the 'vowel space'. His paper includes 
references to both classic and recent work on universal aspects of vowel systems. 

At the end of the nineteenth century, scholars began to feel the need for a 
standardized and internationally acceptable system of phonetic transcription. 
Although there was, and still is, much to be said for non-alphabetic systems of 
representation, it is the International Phonetic Alphabet (IPA), developed and 
promulgated by the International Phonetic Association since 1888, which with or 
without minor modification is now most widely used by linguists. The basic 
principle upon which the IPA is constructed is that of having a different letter for 
each distinguishable speech-sound. 
Primary & Secondary Cardinal Vowels 


A reference system of vowel pronunciation in terms of vowel sounds that is 
independent of any given language has been devised. A famous example of such a 
system is the Cardinal Vowels. Daniel Jones (1976) postulated the vowel 
quadrilateral and the cardinal vowels, a primary set and a secondary set, each 
comprising eight vowels. The choice of 8 vowels in the primary 
cardinal vowel system was probably strongly influenced by the vowel system of the late 
19th/early 20th century. A given cardinal vowel is described by its articulation in 
terms of three dimensions: tongue height, front-back position of the tongue and 
degree of rounding. 


             Primary Cardinal Vowels    Secondary Cardinal Vowels 
             Front      Back            Front      Back 
Close        i          u               y          ɯ 
Close-mid    e          o               ø          ɤ 
Open-mid     ɛ          ɔ               œ          ʌ 
Open         a          ɑ               ɶ          ɒ 

Table 5/1: Cardinal Vowels (i) 

The primary and secondary cardinal vowel categories provide a suitable framework 
for comparison for many languages. 




Fig 5/2: Cardinal Vowels (ii) 


The neutral Schwa vowel sound is produced without tightening the throat and vocal 
cords, which is not the case for the other vowel sounds. Recasens (1991) claims that 
Schwa is the vowel with the highest degree of variability; hence it is important to 
discuss this in detail. 
Quality of Schwa 


Schwa is an important part of the vowel space but is considered a weak vowel 
compared to other vowels. The pitch of the neutral Schwa vowel sound is low and it 
is barely audible. It goes by so fast in speech that a listener may not even 
notice it is there; thus in some cases it is not written orthographically either. To 
produce the neutral Schwa vowel sound, the throat must be relaxed and the air 
passage must remain open. The mouth also remains slightly open in order to 
produce this sound. Schwa is often taken to be a mid-central vowel, in accordance 
with the denotation of the Schwa symbol [ə] in the International Phonetic Alphabet. 
Three phonological processes affecting vowels are well known, viz. Initial Vowel 
Truncation, Vowel Reduction and Vowel Deletion. 

Initial Vowel Truncation 


The vowel Schwa, when used in the initial position of a word, sometimes gets truncated in 
pronunciation. This phenomenon may vary from language to language and is also 
speaker dependent. This variation may also be found in various dialects of a 
language. 


Vowel Reduction 

Phonetic reduction most often involves a centralization of the vowel, that is, the amount 
of movement of the tongue in pronouncing the vowel is reduced, as with the 
characteristic change of many unstressed vowels at the end of English words to 
something approaching Schwa. A well-researched type of reduction is the 
neutralization of acoustic distinctions in unstressed vowels, which occurs in many 
languages. Vowel reduction is a phenomenon that happens around the world, 
according to different rules for each language. The most common reduced vowel is 
Schwa, which is particularly vulnerable to the co-articulatory effects of adjacent 
consonants. 


Vowel Deletion 

An elision or deletion is the omission of one or more sounds (such as a vowel, a 
consonant, or a whole syllable) in a word or phrase. The word elision is frequently 
used in linguistic description of living languages, and deletion is often used in 
historical linguistics for a historical sound change. Many studies have confirmed that 
Schwa deletion is influenced by multiple factors such as lexical stress position, 
sonority, lexical frequency, word length, phonotactic environment and speech style. 


The basis for the weakness of Schwa has been the subject of much research by 
phonological experts Van Oostendrop (2000), but much less attention has been 
devoted to the question of what the phonetic characteristics of Schwa vowel are, 


hence acoustic analysis of only Schwa vowel has been undertaken in this thesis. 
5.1 Variations of Schwa in English 


In linguistics, mainly phonetics and phonology, Schwa is the mid-central vowel 
sound in the middle of the vowel chart, indicated by the IPA symbol /ə/. The term was 
first used in English texts between 1890 and 1895. In Hebrew writing, “shva” (two 
vertical dots) is a vowel diacritic that can be written under letters to indicate an ‘eh’ 
sound (which is not the same as our Schwa). The term was first used in linguistics by 
19th-century German philologists, which is why we use the German spelling, “Schwa”. 
Styler (2012) discussed the difference between Schwa /ə/ and wedge /ʌ/. The 
difference between /ə/ and /ʌ/, at a fundamental level, is that /ə/ is a reduced vowel, 
whereas /ʌ/ is a full vowel. The language-specific variations of Schwa are discussed 
in this section. 


5.1.1. Schwa in British English 


In English, there are 44 distinctive speech sounds: 20 of these are vowels and the 
remaining 24 are consonants. /ə/ is a very frequently occurring vowel in English. It 
occurs only in unaccented syllables. The vowel is articulated with two different 
tongue positions, depending upon whether it occurs finally in a word or elsewhere. 
During the articulation of non-final /ə/, the centre of the tongue is raised towards the 
roof of the mouth to a height between half-close and half-open. The lips are 
neutral. Non-final /ə/ is therefore a central unrounded vowel between half-close 
and half-open. 


[Figure: vowel chart with height levels high, mid-high, mid, mid-low and low] 


Fig 5/3: Cardinal Vowels (British English) 
/ə/ occurs initially, medially and finally in a word. 
Initial   appoint     /əˈpɔɪnt/ 
Medial    excellent   /ˈeksələnt/ 
Final     drama       /ˈdrɑːmə/ 


English is a stress-timed language displaying phonological vowel reduction: weak 
vowels, such as Schwa [ə], are part of the phonological form of many words in the 
language. Schwa in English is mainly found in unstressed positions, but in some 
other languages it occurs more frequently as a stressed vowel. It is a particularly 
frequent vowel in English, as it is the one most commonly heard when a stressed 
vowel becomes unstressed, e.g. telegraph /ˈteləɡrɑːf/ becoming telegraphy 
/təˈleɡrəfi/. 


5.1.2. Schwa in American English 


Vowel reduction is a prominent feature of American English, as well as of other 
stress-timed languages. The vowel /ʌ/ is unrounded and mid-back, more or less 
lowered and fronted. It occurs before all consonants except /h, z, j, w/, e.g. 
supper /ˈsʌpər/, cup /kʌp/, nut /nʌt/. It can also precede clusters consisting of a 
resonant and a plosive, e.g. hunt, bundle, punch, or a plosive alone, e.g. husk and 
lust. Schwa /ə/ is used only in unstressed syllables, word-initially, medially and 
finally, e.g. initial arise /əˈraɪz/, medial begin /bəˈɡɪn/, final comma /ˈkɑmə/. 


[Figure: vowel chart with backness columns Front, Near front, Central, Near back, 
Back and height rows Close, Near close, Close mid, Mid, Open mid, Near open, Open] 


Fig 5/4: Cardinal Vowels (American English) 
5.2 Schwa in Indo-Aryan Languages - Literature Survey 


The Schwa sound is found in Greek, Latin and Sanskrit (where it is called a 
svarabhakti vowel), and the notation <ə> was used for the Indo-European languages. 
The modern Indo-Aryan languages also prevalently use this notation. These languages 
are spoken in most of the north and centre of the Indian subcontinent, with outliers in 
Sri Lanka and the Maldives. Hindi-Urdu and Bengali are by far the largest; of the 
remainder, Marathi in the south of the main area, Gujarati in the south-west, Sindhi to 
the west, Punjabi in the north-west, Assamese in the east, Oriya in the south-east and 
Sinhalese in Sri Lanka all have a current literary standard and are linked to major 
political units. Others such as Bhojpuri or Maithili also have speakers in the tens of 
millions. Across the main area, separate languages have arisen largely by division 
within a geographical continuum. The occurrence of the vowel Schwa in these 
languages is discussed in this section, as deliberated by Pandey (2014) and many 
other linguists. 


5.2.1 Assamese 


Assamese has eight oral vowels. Vowel harmony is a distinguishing feature of the 


Assamese vowel system. 


Fig 5/5: Cardinal Vowels (Assamese) 


Mahanta (2012) discussed the vowel triangle of Assamese. It is observed that there 
is no Schwa in Assamese. 
5.2.2 Bengali 


Schwa viz. /e/ is open-mid central rounded vowel in Bengali. Vowel sequences of 


two and three occur e.g. /ee eo/, /eeo sea/ 


Front Central Back 
Close i u 
Close-mid e re) 
Open-mid ra re} 
Open a 


Table 5/2: Cardinal Vowels (Bengali) 
Schwa /e/ in Bengali is a mid-low vowel and is realized as full vowel e.g. 
/mel/ ‘dirt’ [emol] ‘pure’,[enek] ‘many’. 
5.2.3 Dogri 


Dogri also uses a mid-central open Schwa, like Punjabi. In addition, it has a vowel 
allophone [ə̆], i.e. an extra-short Schwa. 


5.2.4 Gujarati 


Murmur has been reported in Gujarati vowels, which is attributed to the loss of h in 
casual and rapid speech. Thus Schwa also becomes breathy and is represented as /ə̤/, e.g. 
/mahino/ [mg jno] ‘month’ 
/paholti/ [pg lt] ‘broad’ 
There are two allophones of Schwa /9/ viz [3] [9:] 
Example: 
/woahelti/ [ve lt] ‘early’ 
5.2.5 Hindi 
Schwa does not occur word-finally. 
Allophone [ɛ] of /ə/ occurs when it is followed by /h/, e.g. 


eal /keena/ 
ext /leer/ 


5.2.6 Kashmiri 
Kashmiri additionally has a long Schwa. Neither Schwa occurs word-finally. 


5.2.7 Konkani 


In Konkani, Schwa is a close-mid central vowel. There are two allophones of 
Schwa, viz. raised [ə̝] and lowered [ə̞]. These occur before high and low 
vowels in diphthongs, i.e. /əi əu/. 


5.2.8 Maithili 


Schwa is a close-mid central vowel in Maithili and is instrumental in the formation of 
geminates, as in Punjabi. Geminate consonants occur intervocalically. They are, 
however, in free variation with single consonants in this position. 
/patta/ [‘potta] 
Schwa also is found as part of two and three-vowel clusters. 
/ jou iau oia qua ula / 


5.2.9 Sindhi 


Unlike in other Indian languages, Schwa occurs at the end of syllables in Sindhi. 
Sindhi syllables in most cases end with vowels or semi-vowels, and consonants can 
occur in the initial, medial and final positions of words (Jatoi, 1983). 
For example: 


S. No. | Vowel contrast | Sindhi in IPA | English | Sindhi in IPA | English 
1      | /a/ /ə/        | saro          | Miss    | sora          | reeds 


Table 5/3: Occurrence of Schwa in end of Open Syllable in Sindhi 


5.2.10 Urdu 


In Urdu, the first letter of the alphabet, alif (ا), is also used to represent Schwa /ə/. 
The behavior of alif in various contexts is described below: 


Alif + jabar on top = /a/ and is used as full vowel in the initial position of the 


word e.g. 
(el 
Alif+ madd= /a/ in the initial position and is used as full vowel e.g. 
(ATA) el 
Alif in the medial and final position of a word is used as /a/ matra. 
Alif + Zer below alif =/I/ and is used as full vowel in initial position e.g. 
(EAH) ool 
Schwa is also found as part of vowel consonant clusters: 
Alif + vao = /ao/ which is a diphthong. e.g. 


(Tat) 3! 
5.3 Phonetic Variations of Schwa in Punjabi 


Schwa has been the subject of much research by phonologists, yet substantially 
less consideration has been given to the study of the phonetic attributes of the Schwa 
vowel. Punjabi is a tonal language in which Schwa is a short, neutral vowel. Like 
every other vowel, its exact quality changes depending upon the adjacent 
consonants, which needs to be investigated. 


5.3.1 Occurrence of Schwa in Isolated Words 
(i) Word-initial Schwa or inherent Schwa in a consonant cluster (CC) and 


also Schwa as a tone bearing unit 


e.g. MHS (/asan/), AHE (/kasok/), YT /kar/ 


(ii) Nasalized Schwa 


e.g. SSE" /rSbdna/, FAS/basst/ 


(iii) Schwa associated with Geminated Consonant as Onset 


e.g. BHE'/ pidgdzona/, SFE/budzdz5na/ 


(iv) Schwa as Release Vowel 


Schwa does not occur in word-final position in Punjabi (Pandey, 2014); 
however, it is observed as a consonantal release in words ending with a closed syllable 


e.g. Bind /tfadgor/, MMSE /guaid'n /, EAE /tdkkon_/ 


5.3.2 Schwa as Release Vowel in Sentences 


Most psycholinguists have discussed the selection of lexical concepts and the 
generation of a syntactic structure of a sentence appropriate for conveying the 
speaker’s intended meaning or “message”. 

Levelt (1989, 1992) argues that the unit of phonological encoding is the phonological 
word. He postulates a prosody generator that takes as input the rhythmic information 
about the selected words and combines them into phonological word frames. The 
phonological segments for each word are made available separately and then 
associated to the newly constructed phonological word frames in a left-to-right 
manner. 


A short release vowel Schwa is observed in the following Punjabi sentences for some 
speakers: 
(i) Awa ASM /mé kdr dzavaga/ 
(ii) A YS ASEM /mé kdr'o dzava/ 


This neutral Schwa vowel may sound like nothing, or like a low-volume, low-pitch, 
very short grumble or grunt, which ought to be verified experimentally. In this 
context, the phenomenon of the Release Vowel, as discussed for Isolated Words, 
needs to be examined in the context of a sentence, and a comparison needs to be 
drawn between the two acoustic contexts. 


5.4 Experimental Study 


5.4.1 Acoustic Parameters 


The following parameters will be used to study the Schwa quality: 
• Fundamental frequency (F0) 
• Formants (F1, F2) 
• Acoustic space in terms of F1 and F2 
• Intensity, Duration, slope 
• Burst Energy (BE) 


The acoustic space is calculated in order to determine the tongue position involved 
in articulation. A few examples of different categories of words will be recorded. 

There can be more than one Schwa in a word, occurring in different contexts as 
discussed in section 5.3. The Schwa in the words being analyzed will be highlighted 
in the word list. Burst energy, i.e. (Intensity * Duration), of the Schwa vowel will be 
calculated to determine the quality of the vowel, viz. lax/tense. 
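
The burst-energy measure defined above can be sketched as follows. This is only an illustrative sketch: the function names, the units (intensity in dB, duration in seconds) and the sample measurements are assumptions, since the text specifies only the product Intensity * Duration. 

```python
from statistics import mean


def burst_energy(intensity_db: float, duration_s: float) -> float:
    """Burst energy as defined in the text: Intensity * Duration.

    The units (dB and seconds) are an assumption; the text gives only the product.
    """
    return intensity_db * duration_s


def mean_burst_energy(measurements):
    """Average burst energy over (intensity, duration) pairs, e.g. repeated recordings."""
    return mean(burst_energy(i, d) for i, d in measurements)


# Hypothetical measurements for one Schwa token recorded three times.
tokens = [(68.0, 0.10), (70.0, 0.12), (66.0, 0.11)]
average_be = mean_burst_energy(tokens)
```

Averaging over repetitions mirrors the recording design, in which each informant records every word more than once; no lax/tense threshold is applied here because the text does not state one. 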


5.4.2 Methodology 
5.4.2.1 Data Selection, Recording and Annotation 


A list of phonetically balanced words was collated for this experimental analysis. 
The words were selected based on the occurrence of /ə/ in different contexts, as 
discussed in section 5.3.1, using published dictionaries from authentic sources such 
as the Punjabi-English Dictionary, Punjabi University (2011). Phoneme-level 
annotation of the data was done based on auditory perception. The Release Vowel 
study is not limited to Isolated Words and is extended to the sentence level: the 
Isolated Words containing tonemes were each placed in two different sentences to 
examine the significance of the release vowel in the Punjabi language. The 
informants were selected from the region of Punjab where the Malwai dialect is 
spoken. Each informant recorded the entire set of words thrice. Of these, the words 
containing a toneme were also recorded in sentences, i.e. each word in two different 
sentences, for the study of the release vowel in an isolated word vis-à-vis its 
occurrence in a sentence. The sentence data was recorded for only 8 speakers 
(4 male & 4 female), and the corresponding Isolated Words were also recorded by 
these speakers for the study of the release vowel. The recording and annotation of 
the data was carried out as per the details discussed in section 1.8.2. Spectrographic 
analysis of all the male & female samples was carried out, phoneme-level annotation 
was done, and the Release Vowel was marked in Isolated Words as well as in 
sentences. 
F0, the first two formants (F1 & F2), intensity and duration of the Schwa vowel under 
examination were recorded using PRAAT software for all the words being analyzed, 
and also for the Release Vowel associated with Isolated Words as well as its 
occurrence in a sentence. Based on this data, the analysis of vowel quality will be 
carried out in various acoustic settings. 
5.4.2.2 Recording of Data Sheets 


The various acoustic parameters as discussed in section 5.3 were recorded and are 
given in the respective sections below. 


5.4.2.3 Analysis of Schwa Vowel in Isolated Words 


According to Wilder (1975), vowel height is inversely correlated with the frequency of 
the first formant: the higher the vowel (the higher the tongue position), the lower the 
F1. Vowel backness is reflected in the frequency of the second formant or, more 
precisely, in the distance between the first and second formant frequencies. The 
frequency of the third formant does not change as much as that of F1 & F2. Formant 
frequencies higher than F3 are not considered important cues to the identity of the 
vowel. The production of a nasalized vowel requires two resonators, the oral and the 
nasal cavity. The interaction between these two resonators and the heavy damping of 
the nasal cavity result in several differences between oral and nasalized vowels. 
Nasal vowels typically show greater formant bandwidth, lower overall amplitude and 
a low-frequency nasal formant. A traditional "vowel diagram" can be obtained by 
plotting the vowel formants in a graph where the horizontal axis is (F2-F1) and the 
vertical axis is the inverse of F1. Burst energy, i.e. (Intensity * Duration), of the 
Release Vowel was recorded for Isolated Words vis-à-vis its burst energy in a 
sentence to identify the quality of the vowel, viz. lax/tense, in both contexts. 
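
The vowel-diagram mapping described above can be sketched in a few lines. This is an illustrative sketch, not the procedure used in the study: the function name is hypothetical, and "inverse of F1" is interpreted here as a reversed axis (negated F1) so that higher vowels plot higher, which is an assumption. The sample values are the category averages reported in Tables 5/4 to 5/7. 

```python
def vowel_diagram_point(f1_hz: float, f2_hz: float) -> tuple[float, float]:
    """Map formants to diagram coordinates: x = F2 - F1, y = -F1.

    Negating F1 is an assumed reading of "inverse of F1": a lower F1
    (a higher vowel) then gets a larger y value.
    """
    return (f2_hz - f1_hz, -f1_hz)


# Category averages (F1, F2) as reported in Tables 5/4 to 5/7.
AVERAGES = {
    "oral": (635.40, 1538.14),
    "nasalized": (534.99, 1502.19),
    "geminated onset": (539.42, 1783.58),
    "release vowel": (526.17, 1604.775),
}

points = {name: vowel_diagram_point(f1, f2) for name, (f1, f2) in AVERAGES.items()}

# Lower F1 means a higher vowel, so sorting by y (descending) ranks by height.
by_height = sorted(points, key=lambda name: points[name][1], reverse=True)
# A larger F2 - F1 places the vowel further towards the front.
by_frontness = sorted(points, key=lambda name: points[name][0], reverse=True)
```

Ranking the categories this way makes the relative height and frontness of each Schwa variant explicit before any plotting is done. 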


5.4.2.3.1 Oral Schwa Vowel 


Schwa occurs only in word-initial and word-medial positions in the Punjabi language. 

The word-medial Schwa is usually used functionally to break consonant clusters 
and is not represented orthographically but is phonetically realized. 


Words IPA F1 F2 F2-F1 
WTS /gsan/ 625.1 1526.40 901.3 
wnt amit! 634.22 | 1414.22 780 
nH /aphim/ 649.33 1240.83 DoT 5 
nis fonak"/ 711.22 | 1526.78 | 815.56 

nifA /abias/ 637 1306.90 669.9 
ue /okal/ 660 1484.00 824 
nse /andd/ 665.1 1558.90 893.8 
nea /atfakk/ 603.8 1701.30 1097.5 
nea / atfakk / 751.8 1678.80 927 
wit /agge/ 620.8 1547.90 927.1 
— /hosab/ 669.5 1505.50 836 
JAY /hasab/ 606.1 1506.70 900.6 
aHat /kosok/ 594.25 | 1380.75 786.5 
ana /kasok/ 590.75 | 1406.45 815.7 
Age /sarab/ 626.1 1565.20 939.1 
Age /sorab/ 599.5 1569.20 969.7 
Aaa /socak/ 614 1609.90 995.9 
Aga /sotak/ 610.50 1684.80 1074.3 
HIS /fagon/ 573.4 1685.00 1111.6 
Aas [fogon/ 577.7 1640.10 1062.4 
Hae /sdkot/ 582.2 1633.20 1051 
dae /takkon/ 683.3 1665.80 982.5 
Baer Nabbana/ 655.3 1500.70 845.4 
whAT /késsa / 592 1562.00 970 
aH (bosst/ 492.5 1527.40 | 1034.9 
wes /dorlabb/ 814.3 1472.19 | 657.89 
sa itfage/ 748.1 1686.50 938.4 
Sut /boggi/ 584.2 1457.87 873.67 

ug /kollukara/ 654.6 1561.00 906.4 

Average 635.40 | 1538.14 | 902.74 


Table 5/4: F1 & F2 - Oral Schwa 
5.4.2.3.2 Nasalized Schwa (ə̃) 


Schwa before a nasal in the same syllable tends to be nasalized. A few examples of 
Schwa with nasalization are shown below: 


Words F1 F2 F2-F1 
mide! ansd/ | 541.39 | 1498.20 956.90 
HH) besst/ | 560.80 | 1505.90 945.10 
HAE! skat/ | 593.10 | 1427.20 904.10 
"iGS/ Sgur/ | 499.78 | 1274.40 774.63 
wey tsda/ | 476.50 | 1557.20 | 1080.70 
ABET /sSd’na/ | 597.00 | 1653.70 1131.70 
eet /kadti/ =| 453.67 | 1456.77 | 1003.10 
aS /gsd5la/ | 475.39 | 1615.90 1140.10 
By /badiia/ | 471.33 | 1359.33 888.00 
SSE /rSbSna/ | 595.49 | 1537.20 947.10 
mis /Sdéra/ | 491.40 | 1594.90 | 1103.50 

Average 534.99 | 1502.19 967.20 


Table 5/5: F1 & F2 - Nasalized Schwa 


5.4.2.3.3 Schwa Associated with Geminated Consonant as Onset (ə_g) 


The effect of a geminated toneme occurring as the onset of the syllable containing 
Schwa (ə_g) needs to be examined to understand the variation of Schwa in this 
context. 

Words F1 F2 F2-F1 


wsET /bo'dgdzoena/ | 519.99 1861.9 1351.00 


BE /ri'dgd3sena/ | 594.70 | 1901.6 1396.90 


BE /pi' dgdgogna/ 493.10 1927.4 1434.30 


EUS /1'ddagt/ 624.40 1729 1104.60 
Average 539.42 1783.58 1244.16 


Table 5/6: F1 & F2 - Geminated Schwa 


The analysis of the above examples reveals that the tongue moves higher and 
forward in the phonetic realization of Schwa in such cases. 

5.4.2.3.4 Schwa as Release Vowel (ə_r) 


Schwa does not occur in word-final position (Pandey, 2014); however, it is 
observed as a consonantal release in words ending with a closed syllable, which is 
termed the Release Vowel (RV). The examples are shown below: 


Words IPA F1 F2 F2-F1 
ug [tater 554.08 1681.56 | 1127.48 
EMS eco 565.48 |_1694.52_| 1129.04 
nea /atfakko,/ 518.2 1539.6 1021.4 
HRY ees 535.37 1545.62 | 1010.25 
HE /sadgo,/ 436.98 1766.85 | 1329.87 
sing /tfadgore,/ 539.99 1682.13 | 1142.14 
dae / tokkang, / 556.94 1606.6 1049.66 
WE ae 523.42 | 1675.41 | 1151.99 
ya /tfoggor! 515.51 1586.65 | 1071.14 
Bo /tfut'o,/ 549.13 1518.61 969.48 
te Honey 556.42 1598.79 | 1042.37 
ay /bodge,/ 451.06 1687.67 | 1236.61 
ee |e 471.43 1459.12 | 987.69 
fy 564.83 1602.56 | 1037.73 
shy — 517.84 1568.88 | 1051.04 
uy aie 490.42 1588.66 | 1098.24 
ey Niger! 597.81 1477.94 | 880.13 
Average 526.17 1604.775 1078.6 


Table 5/7: F1 & F2 — Release Vowel in Isolated Words 


5.4.2.3.5 Vowel Diagram of Schwa 


Based on the above findings, the acoustic space for the various phonological settings 
is depicted in the graph below, plotting the average values of each category as 
discussed above: 


[Figure: plot of F1 (axis running from 750 down to 400) against F2-F1 for Oral Schwa, 
Nasalised Schwa, Schwa associated with geminated consonant, and Release Vowel in 
isolated words] 


Fig 5/6: F1, F2-F1 Plot of Schwa in different Acoustic Contexts 




The acoustic variations of Schwa can be observed from F1 & F2 plot given below: 


[Figure: scatter plot of F2 (1000 to 2000) against F1 (300 to 900) for Oral Schwa, 
Nasalized Schwa, Geminated Schwa, and Release Vowel in isolated words] 


Fig 5/7: F1, F2 Scattered Graph of Schwa 


The range of F1 & F2 of the Schwa vowel in the above acoustic contexts is tabulated below: 


F1 Table 
Categories | F1 range (Hz) | Vowel height 
ə          | 500-800       | Mid 
ə_n        | 450-600       | Transition zone high 
ə_g        | 500-650       | 
ə_r        | 450-600       | 
Table 5/8: F1 Range in different Acoustic Contexts 
F2 Table 
Categories | F2 range (Hz) | Articulatory zone 
ə          | 1200-1700     | Central 
ə_n        |               | 
ə_g        | 1500-1950     | Transition zone front 
ə_r        | 1450-1800     | Transition zone front 


Table 5/9: F2 Range in different Acoustic Contexts 
Fig 5/8: F1, F2 Average Values Bar Chart in different Acoustic Contexts 
5.4.2.4 Comparison of Release Vowel (ə_r) in Isolated Words vis-à-vis Sentences 


The average burst energy of the Release Vowel in Isolated Words vis-à-vis the 
average burst energy when the same word occurs in two sentences is tabulated below: 


Words | Word within a Sentence: F1 | F2 | Burst Energy (Duration * Intensity) | Isolated Words: F1 | F2 | Burst Energy (Duration * Intensity) 
wo 443.25 1792 2.71 470.00 1725.00 9.03 
/karoa,/ 
wae 391 1498 0.78 567.00 1730.50 6.04 
/takkonp,/ 
US /t5no,/ 409 1969.5 2.28 538.50 1686.00 9.23 
Ua /tdra,/ 497.5 1650.5 3.23 506.50 1705.50 10.87 
va 387 2342 2.26 494.00 1612.00 10.15 
/tfaggo,/ 
Ef) /tfuto,/ 475.5 1921 2.21 442.00 1522.00 10.02 
Bing 408 1323.5 4.50 470.50 1724.50 8.72 
/tfadzoro,/ 
fw 502 1687.5 0.17 694.00 1876.00 5.18 
/tfigaro,/ 
YSwy 551.5 1695 2.53 499.00 1766.50 8.13 
/onkete,/ 
exey 556 1777.5 1.92 441.50 1471.50 9.19 
/odzbbgo,/ 
Hes 446.5 1569.5 3.40 516.00 1477.00 11.38 
/si@ lo,/ 
PES (a 514 1450 3.31 528.50 1584.50 8.05 
/atfakko,/ 
py /digo,/ | 546.5 1746.5 4.76 511.50 1625.00 9.66 
shy /taga,/ 426.5 1727.5 1.20 454.50 1567.00 8.48 
uy /pigo,/ 624 1940 2.30 453.00 1655.00 6.58 
Gy /iigo,/ 417 1750 1.48 515.50 1471.00 9.60 
HY /maga,/ | 593 1759.5 1.70 482.00 1601.50 7.12 
AY /sadzo,/ 448 1799 1.16 402.50 1940.00 6.95 
BE /bbd3o,/ 435.5 1930 0.61 418.50 1992.50 8.37 
Average 474.58 | 1771.17 2.24 495.00 1670.16 8.57 


Table 5/10: F1 & F2 - Isolated Words vis-à-vis Sentence 


Fundamental frequency (F0) of male & female speakers 


Male Female 
Isolated Words 190.11 263.33 
Word occurrence in sentence 188.32 268.58 


Table 5/11: F0 - Isolated Words vis-à-vis Sentence 
5.5 Results & Discussion 


The variations observed from the above data analysis are reported below with 
reference to the characteristics of the pure vowel /ə/: 


[Figure: Punjabi vowel chart showing the acoustic variability of Schwa] 


Fig 5/9: Cardinal Vowels (Punjabi) - Acoustic Variability of Schwa 


The data analysis from the above tables and graphs reveals that there is variation in 
the quality of Schwa in the Punjabi language. Schwa is indicated in IPA as a 
mid-central vowel, as discussed in this chapter; however, the data analysis shows 
changes in the vowel height and degree of backness. The following is observed from 
the above tables and graphs: 


• The values of F1 decrease for nasalized Schwa and also for the Release 
Vowel associated with Isolated Words. A similar phenomenon is observed when a 
geminated toneme occurs as the onset of the syllable containing Schwa. In 
these cases the vowel height is 20-25% higher. 

• The F2-F1 value is maximum when a geminated toneme is the onset of the syllable 
containing Schwa (ə_g), and decreases for the Release Vowel associated with 
Isolated Words having a closed last syllable (ə_r) and for nasalized Schwa (ə_n); as 
compared to mid-central Schwa, there is a relative shift in the vowel position 
towards the front. This shift in the case of ə_n is negligible. 
The Release Vowel in Isolated Words (ə_r) gets shifted towards the front by 20% in 
articulation. The major change in the place of articulation happens in the case of 
ə_g, which can be considered to lie between front and central in the vowel triangle. 

• It is observed that the burst energy of the Release Vowel in a sentence is much 
less (only about 25%) than that of the Release Vowel associated with Isolated Words, 
and hence it can be ignored. It is also noted that there is not much variation in 
fundamental frequency and the first two formants. 

• Thus the Release Vowel in a sentence gets suppressed by the continuation 
of speech in the sentence and the accompanying intonation features, whereas the 
release energy is maximum in Isolated Words due to uninterrupted pronunciation. 
Hence the Release Vowel in a sentence is not phonologically significant. Thus 
mid-central Schwa /ə/ is a pertinent vowel in terms of the acoustic variations 
discussed above, which can be represented in IPA as follows: 


(i) All nasalized Schwa (dn) / o/ 
(ii) Schwa associated with geminated consonant as onset (9¢) / o/ 
(iii) Schwa as Release Vowel (9,) /O/ 


The above findings can assist TTS developers in realizing natural speech in Punjabi 


TTS. 
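
The comparisons reported in this section can be re-derived from the tabulated averages with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the averages printed in Tables 5/4 to 5/7 and Table 5/10. 

```python
def relative_change(value: float, reference: float) -> float:
    """Change of value relative to reference, as a fraction of the reference."""
    return (value - reference) / reference


# Average burst energy of the Release Vowel (Table 5/10).
BE_SENTENCE, BE_ISOLATED = 2.24, 8.57
be_ratio = BE_SENTENCE / BE_ISOLATED

# Average F2-F1 per category (Tables 5/4 to 5/7).
F2_MINUS_F1 = {
    "oral": 902.74,
    "nasalized": 967.20,
    "geminated onset": 1244.16,
    "release vowel": 1078.6,
}

# Frontward shift of each variant relative to the oral (mid-central) Schwa.
front_shift = {
    name: relative_change(value, F2_MINUS_F1["oral"])
    for name, value in F2_MINUS_F1.items()
    if name != "oral"
}

for name, shift in front_shift.items():
    print(f"{name}: {shift:.1%}")
print(f"burst energy ratio (sentence / isolated): {be_ratio:.1%}")
```

With these averages the burst-energy ratio comes to roughly 26%, in line with the "only about 25%" noted above, and the frontward shift of the Release Vowel comes to just under 20%. 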
Chapter 6 
Correlation of Morpho-syntactic Features with Lexical Representation and its 
Co-articulation 


6 Introduction 


In spoken production, there is an intimate link between morphological and 
phonological processing. First and foremost, the output of morphological operations 
serves as the input to phonological processes. When morphological processes 
combine lexical representations (morphemes) to form a multi-morphemic word, the 
constituent sounds must also be combined in such a way that the resulting 
phonological representation is suitable for driving spoken production. 

The PLS provides an inter-operable specification of pronunciation information which 
can be used for speech technology development. W3C PLS 1.0 represents the 
requirements of Latin-script-based languages, with a few examples mentioned for 
Japanese and Chinese, thus keeping the specification very broad; however, it 
currently does not cover morphological, syntactic and semantic information 
associated with pronunciations (such as word stems, inter-word semantic links, 
pronunciation statistics, prosody etc.). POS is an available source for feature 
extraction for building NLP & speech systems. A PLS based on morphology, with 
overriding phonological features such as stress, tone, gemination and nasalization, 
covering phonological words that contain maximum inflection under each POS 
category, can be a useful resource for training speech systems. Initial work on 
part-of-speech (POS) and morphological pronunciations in the Pronunciation Lexicon 
Specification (PLS) for Bengali has been carried out, as discussed in section 1.7.1. 
The paper proposed the addition of a POS feature to the PLS XML structure, either as 
an attribute or as an element. This can be used to choose the proper pronunciation 
among multiple pronunciations of the same orthography of a word. This information 
can reduce the search time in a large-vocabulary recognition and synthesis system. 
This needs to be further investigated for the Punjabi language. Therefore there is a 
need to standardize the tags used to encode part-of-speech information in PLS data. 
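
The idea of selecting one pronunciation among several for the same orthography by means of a POS feature can be sketched as follows. This is a minimal illustration under stated assumptions, not the structure proposed in the cited paper: the attribute name `pos` is hypothetical and is not defined by PLS 1.0, the helper names are invented for this example, and the entries use a romanized placeholder spelling with illustrative transcriptions, whereas a real lexicon would hold Gurmukhi graphemes. 

```python
import xml.etree.ElementTree as ET

# Namespace of PLS 1.0 documents.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_lexicon(entries, lang="pa"):
    """Build a PLS-style <lexicon> from (grapheme, ipa, pos) triples."""
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", XML_LANG: lang},
    )
    for grapheme, ipa, pos in entries:
        # "pos" is a hypothetical attribute name; PLS 1.0 does not define it.
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme", {"pos": pos})
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return lexicon


def pronunciations(lexicon, grapheme, pos=None):
    """Return the IPA strings for a grapheme, optionally filtered by POS tag."""
    found = []
    for lexeme in lexicon.findall(f"{{{PLS_NS}}}lexeme"):
        if lexeme.findtext(f"{{{PLS_NS}}}grapheme") != grapheme:
            continue
        if pos is None or lexeme.get("pos") == pos:
            found.append(lexeme.findtext(f"{{{PLS_NS}}}phoneme"))
    return found


# Placeholder homograph: one spelling, two POS-dependent pronunciations.
ENTRIES = [
    ("bhara", "pəra", "N"),
    ("bhara", "pəraː", "V"),
]
lexicon = build_lexicon(ENTRIES)
noun_forms = pronunciations(lexicon, "bhara", pos="N")
xml_text = ET.tostring(lexicon, encoding="unicode")
```

Filtering on the POS tag narrows the candidate list before any further disambiguation, which is the search-time benefit mentioned above; whether the feature is best carried as an attribute or as a child element is left open here. 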
6.1 Standard POS Tag Set 


Part-of-speech tagging is one of the key building blocks for developing speech 
applications. A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads 
text in a language and assigns a part-of-speech tag, such as noun, verb or adjective, 
to each word. Punjabi has a rich base of POS-based inflections, e.g. 


IPA POS Gloss 


/okkarva/ JJ.M.S engraved, etched 


/okarvavna/ VM.M.S to get engraved, inscribed 


/okacvai/ N.F,S wages for 


Table 6/1: Example of POS Based Inflections 


The POS tag set for the Punjabi language has been standardised as discussed in the 
paper “Standardization of POS Tag Set for Indian Languages based on XML 
Internationalization best practices guidelines” by Lata et al. (2012), which is 
enclosed at Annexure I of Appendix D. The prosodic features of Punjabi are 
discussed here with the help of examples transcribed in IPA using these standard POS 
tags. 


6.2 POS Inflections in Punjabi 


Punjabi is a highly inflectional language, like most other Indo-Aryan languages. POS 
is an important feature in the Punjabi language. The main parts of speech (POS) in 
Punjabi are noun, pronoun, verb, adjective, adverb, preposition, conjunction and 
interjection. An affix is a morpheme that is attached to a word stem to form a new 
word. Affixes are classified according to their position with reference to the stem, as 
discussed below: 
6.2.1 Prefix 


A prefix is a morphological unit, for example 'un-' or 'multi-', which is added to the 
beginning of a word in order to form a different word. For example, the prefix 'un-' is 
added to 'happy' to form 'unhappy'. Prefixes are used much less than suffixes in 
Punjabi. They are mostly used with nouns and adjectives; their use with verbs is 
very rare. For example: 


Word     IPA         POS     Gloss 
UfIs     /pél/               first step/initiative 
ufser    /pél-a/     JJ,M    first 
UIE      /pél-u/             aspect/point of view 
ufas     /pél-e/     JJ,M    first/foremost 
UfIBE    /pél-on/            calved for the first time 
ufsst    /pél-a/             formerly/beforehand 


6.2.2 Suffix 


A suffix is a morphological unit attached to the end of a word to form a new word or 
to change the grammatical function (or part of speech) of the original word. For 
example, the verb read is made into the noun reader by adding the suffix -er. 
Similarly, read is made into the adjective readable by adding the suffix -able. The 
addition of a suffix may also lead to a change in number, gender & person. The 
Punjabi examples related to these changes are covered in the following section. 
6.2.2.1 Change in Grammatical Categories 


Word IPA POS Gloss 
Hs" /miid-a/ N,M,S boy 
H3 /mitid-e/ N,M,P boys 

Ht /mitid-Ia/ N,M,P boys 

Hee /mtid-Io/ N,M,P boys 


6.2.2.2 Word Inflection for Number, Gender and Person 


Inflection for Number change 


Word IPA POS Gloss 
Hs" /mitid-a/ N,M,S boy 
H3 /mtid-e/ N,M,PI1 boys 
oat / kur-i / N,F,S girl 
asm / kut-ia / N,F,P1 girls 


Gender change 


Word IPA POS Gloss 
ge / bod d-a/ N,M old man 
gt / bod d-i / N,F,S old woman 
wat /kor-a/ N,M horse 
west /kor-i/ N,F,S mare 
6.3 Distinctive Features of Morphology-Phonology Interface 

The morphological structure of a complex word determines how the constituent 
morphemes of a word are realized phonetically. The phonological structure of a 
complex word reflects its morphological structure, but is not isomorphic to that 
structure. A native speaker understands that spoken words are made up of sequences 
of speech sounds and has the ability to hear and manipulate the sounds in spoken 
words. This ability is known as phonemic awareness. A phoneme is capable of 
distinguishing the meanings of words. Phonemic awareness is a subset of 
phonological awareness in which listeners are able to hear, identify and manipulate 
phonemes, the smallest mental units of sound that help to differentiate units of 
meaning (morphemes). Phonology plays a role in the selection of one from a set of 
competing affixes. The supra-segmental phonemes, i.e. patterns of articulation due to 
the presence of tone, stress, nasalisation, gemination etc., are phonemic. Hence the 
distinctive features of the morphology-phonology interface of the Punjabi language 
are discussed in this section. These features are essential for the completeness of 
PLS and should necessarily be captured for complete phonological coverage of the 
language. 


6.3.1 Tone 


Tone in the Punjabi language has been discussed in detail in chapters 2 & 3. The 
tonal minimal pairs based on three types of tone, i.e. high tone /ə́/, low tone /ə̀/ and 
mid tone /ə̄/, are discussed in this section. Level tone is also phonemic; however, it 
is customary not to mark it in the pronunciation lexicon. For example: 


Word IPA POS TONE Gloss 
HS /son/ V,Aux Nil were 
HS /sdn/ N,M Nil year 
bite) /son/ N,M HighTone hole made by thieves 
ag /par/ N,M Low Tone load 
urg /par/ RB,Both NiL beyond 
ggg =s /bahor/ ~—s RB, Both NiL out 
6.3.2 Nasalization 


Nasalisation is phonemic in Punjabi. Tippi (ੰ) and Bindi (ਂ) are used to represent 
nasalisation. Functionally both are the same; however, there are some orthographic 
rules regarding the use of tippi and bindi. Tippi is used only in conjunction with some 
vowels and matras, i.e. [ਅ, ਇ, ਿ, ੁ, ੂ] /ə, ɪ, ɪ, ʊ, u/, and the rest of the vowels and 
matras use bindi. For example: 


Word IPA POS Gloss 

wa /kata/ N,F to subtract/decrease 
ujet /kdta/ N.M large bell 

A /so/ N hundred 

H / $3/ Vv to sleep 


6.3.3. Gemination 


Punjabi has a large number of geminates. In Punjabi, gemination is phonemic and it 


results in unique words. For example: 


Word IPA POS Gloss 
cH /das/ JJ digit ten 
CH /dass/ Vv to tell 

feat /alli/ JJ from heart 
eat /allli/ N delhi 
AS /sat/ N,M strength 
HS /satt/ QTC seven 
Act /soda/ RB always 
AT /sadda/ N,M invite 
weer /kota / N knee 
wecet /kottna/ Vv to press 


6.4 Word variants 


6.4.1 Free variations 


It is the phenomenon of two (or more) sounds or forms appearing in the same 
environment without a change in meaning. There is an alternative textual 


representation for the same word or phrase in Punjabi. For example: 


Word IPA POS Gloss 
doenrg /gorudvara/ N place of worship 
qdeeg /gordovara / N place of worship 
qsqst / kotkotari/ Vv tickling 

aqsqst /kotkoti / Vv tickling 
3 /pe / N fear 
3 / po/ N fear 
6.4.2 Homonyms 


Homonyms are words which sound alike but have different meanings. There is an 
abundance of homonyms in Punjabi. For example: 


Word    IPA        POS    Gloss
We      /kran/     V      to eat
We      /khan,/    N      mine
33      /Bd/       N      push-up
33      /Bd/       N      punishment
3s      /Bdq/      N      noise


6.4.3  Homographs 


Homographs are words with the same spelling (and sometimes different 
pronunciations) but different meanings. For example: 


Word    IPA        POS    Gloss
ਭਰਾ    /pora/     N,M    brother
ਭਰਾ    /pora:/    V      to get filled
ਹਰਾ    /hora/     JJ     green colour
ਹਰਾ    /hora:/    V      to defeat
6.4.4 Homophones 


Homophones are words with the same pronunciation but different meanings and 
different spellings. For example: 


Word IPA POS Gloss 

ag /kot/ N hard layer 

ag /ka't/ V continuous boiling at low temperature 
urgt /para/ N gap 

urgt /para/ N learner 


6.4.5 Borrowed Words 


A loan word is a word borrowed from a donor language and incorporated into a 
recipient language. Borrowed words are adapted to the sound system and 
grammatical system of the language into which they are borrowed. For instance, when 
Punjabi borrows a word from another language, it changes the word's gender or other 
categories according to its own nature or behaviour. Punjabi has borrowed extensively 
from Sanskrit, Hindi, Urdu, Persian and English. The letters ਸ਼ /ʃ/, ਜ਼ /z/, ਫ਼ /f/, ਖ਼ /x/ 
and ਗ਼ /ɣ/ are used only for words borrowed from Perso-Arabic. Such borrowed words 
have been assimilated into Punjabi; however, some native speakers do not pronounce 
the nukta, hence both variants are in use. For example: 


Word in Urdu    Word variants in Punjabi    IPA
زمانہ           ਜ਼ਮਾਨਾ                       /zamana/
                ਜਮਾਨਾ                        /dzomana/


These words pose a challenge in building a PLS for the Punjabi language: it must be 
decided whether one of the pronunciations or both should be kept in the database. 
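
One way to keep both variants is to store them in a single lexeme, since PLS 1.0 allows several `<grapheme>` and `<phoneme>` children per `<lexeme>` and a `prefer` attribute on `<phoneme>`. The sketch below is an illustration under stated assumptions: the helper name `variant_lexeme` is hypothetical, the Gurmukhi spellings and the normalised transcriptions are assumed forms of the two variants shown above, and the namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def variant_lexeme(graphemes, phonemes, preferred=0):
    """Build one <lexeme> holding several spellings and pronunciations.

    The pronunciation at index `preferred` is marked with prefer="true".
    """
    lexeme = ET.Element("lexeme")
    for spelling in graphemes:
        ET.SubElement(lexeme, "grapheme").text = spelling
    for index, ipa in enumerate(phonemes):
        phoneme = ET.SubElement(lexeme, "phoneme")
        phoneme.text = ipa
        if index == preferred:
            phoneme.set("prefer", "true")
    return lexeme


# Both the nukta and the non-nukta variant are kept; the first is preferred.
entry = variant_lexeme(["ਜ਼ਮਾਨਾ", "ਜਮਾਨਾ"], ["zəmana", "dʒəmana"])
print(ET.tostring(entry, encoding="unicode"))
```

Which variant is marked as preferred remains a design decision for the lexicon builder; the sketch only shows that neither pronunciation has to be discarded.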
6.4.6 Acronyms / Abbreviations 


An acronym is a word or name formed as an abbreviation from the initial components 
of a phrase or a word, usually individual letters and sometimes syllables. There are no 
universal standards for abbreviations and their orthographic styling. For some words 
and phrases, the pronunciation can be expressed quickly and conveniently as a sequence 
of other orthographies. Acronyms / abbreviations as used in some Punjabi language 
terms are given below: 


ਭਾਸ਼ਾ ਵਿਭਾਗ ਪੰਜਾਬ - ਭਾ.ਵਿ.ਪੰ.
/pafa/ /vibag/ /padzab/ - /pa./ /vɪ./ /pa./
ਸਰਦਾਰ ਕੇਸਰ ਸਿੰਘ - ਸ.ਕੇ.ਸਿੰ.
/sordar/ /kesor/ /stg/ - /s./ /ke./ /si./
ਵਿੱਚੋਂ - 'ਚੋਂ
/vitft{o/ - /tfo/
ਉੱਤੇ - 'ਤੇ
/ote/ - /’te/
ਉੱਤੋਂ - 'ਤੋਂ
/ott3/ - /°t3/
6.4.7 Multi-Word Expressions (MWEs) 


Multi-word expressions (MWEs) are expressions which are made up of at least 2 
words and which can be syntactically and/or semantically idiosyncratic in nature. 
They act as a single unit for linguistic analysis, e.g. 


ਪੰਜਾਬੀ ਯੂਨੀਵਰਸਿਟੀ ਪਟਿਆਲਾ 


/p3dgabi/ /junivorsti/ /potrala/ 


Such language specific data of all possible MWEs also needs to be encoded in the PLS. 
The value of the MWE attribute can be “NER” for encoding this type of data. The 
corresponding abbreviation can be encoded using the <alias> element. Other values can 
be defined suitably, for example “echo” or “duplicate” for encoding echo words and 
duplicate words. The examples will be covered in the next chapter. 


6.5 Conclusion 


The data covering the morpho-syntactic features of Punjabi as elaborated in this 
chapter needs to be encoded in the PLS to obtain a prosodically rich PLS. The word list of 
unique words in Punjabi from the major POS categories such as Noun, Verb, Adjective 
and Adverb, and other granular features, may be collated along with the variations for 
developing phonologically rich PLS data. 
Chapter 7 
Prosodic Lexical XML Database-PLS Framework, Rules and Sample Data 


7. Introduction 


Pronunciation Lexicons are of critical importance in the development of speech 
technology for a language. They represent the interface between the interpretation 


and analysis of speech. 


[Figure: block diagram linking Language Understanding, Context, Interpretation, the 
Pronunciation Lexicon Specification (PLS) and Language Generation] 

Fig 7/1: Interface between the Interpretation and Analysis of Speech 


In text-to-speech (TTS) synthesis, for example, phonemic transcriptions of the 
pronunciations of words help determine the selection of the acoustic models for 
generating the targeted waveform. An Automatic Speech Recognition (ASR) engine 
based on the Speech Recognition Grammar Specification (SRGS) uses the PLS to 
leverage multiple pronunciations of words and phrases. PLS entries are also applied 
to the graphemes inside SRGS grammar rules to convert them into the phonemes to 
be recognized. 


In Indian languages, Part of Speech (POS) plays an important role in pronunciation, as 
discussed in chapter 6. The XML schema therefore needs to be extended so that it can 
capture the language specific morphological features in the PLS. The proposed XML 
design is also targeted towards search optimization of PLS data. 
7.1 Punjabi Lexicon 


The Punjabi lexicon is mainly composed of Tadbhavas, and the use of Tatsama words is very 
limited. The pronunciation of borrowed words is adapted by the Punjabi speaker as 
discussed in section 6.4.5. Punjab being an agricultural state, the vocabulary is rich in 
this domain, whereas the vocabulary of science and technology is less developed. 


Punjabi has inflectional morphology as discussed in section 6.2. Punjabi singular 
nouns abundantly use ਾ /-a/ as a suffix, which is indicative of the predominant use of the 
masculine gender. It is also used with the singular forms of verbs and verbal 
adjectives. The corresponding feminine suffix is ੀ /-i/. 


Tone is phonemic and has been discussed in section 6.3.1. A word carries only a single 
tone, which is realised on the nucleus of the syllable containing a toneme or a conjunct 
of the consonant ਹ /ɦ/. The short vowels ਿ /ɪ/ and ੁ /ʊ/ are used infrequently. Among the 
long vowels, ਐ /ɛ/ and ਔ /ɔ/ are used less often. Punjabi vocabulary 
contains monosyllabic and polysyllabic words; however, disyllabic words are the most 
frequent. Many monosyllabic words end in long vowels. Diphthongs 
are frequently found in Punjabi. Sequences of four to five vowels agglutinated to a verb 
are commonly found in the language. 


7.2 Current Framework for Pronunciation Lexicon Specification (PLS 1.0) 


The current version of PLS may be referred to as a baseline specification, as it addresses 
the requirements of Latin script based languages. 


The specification covers multiple pronunciations and multiple orthographies in the 
XML structure at the lexicon level, thus providing the flexibility of creating language 
specific PLS documents. 
Elements      Attributes                  Description
<lexicon>     version, xml:base,          root element for PLS
              xmlns, xml:lang,
              alphabet
<meta>                                    element containing meta data
<metadata>                                element containing meta data
<lexeme>      xml:id, role                the container element for a single lexical
                                          entry
<grapheme>                                contains orthographic information for
                                          a lexeme
<phoneme>     prefer, alphabet            contains pronunciation information for
                                          a lexeme
<alias>       prefer                      contains acronym expansions and
                                          orthographic substitutions
<example>                                 contains an example of the usage for
                                          a lexeme


Table 7/1: Markup Language Definition of PLS 1.0 


PLS 1.0 covers only the segmental features of a language. There is no provision to cover 
morphological, syntactic and semantic information associated with pronunciations 
(such as word stems, inter-word semantic links, prosody etc.). Hence the research 
undertaken has addressed these additional language specific requirements and 
proposed a new framework. 
7.3 Proposed Framework for Pronunciation Lexicon Specification for 
Punjabi Language (PLS 2.0) 


The main objectives of the research undertaken have been: 


• Adaptation of the W3C PLS 1.0 for evolving a framework for capturing Punjabi 
language phonological features. 

• Corroboration of the major linguistic aspects through analytical study of recorded 
speech signals for the Punjabi language. 

• Identification of the challenges in designing a web based Machine-Readable 
Pronunciation Lexicon Specification in XML. 

• Design of new lexeme elements to incorporate the identified features. 


The supra-segmental features of the Punjabi language have been experimentally 
examined using recorded speech samples and reported in the previous chapters. 
Based on these findings, W3C PLS 1.0 has been augmented as discussed here. 

7.3.1 Addition of New XML Tags/Attributes 


The correlation of morpho-syntactic features with lexical representation and its 
co-articulation has been discussed in chapter 6. Based on these findings, new XML 
elements/attributes (marked in yellow colour in the original table) are proposed for 
addition to the existing PLS 1.0, as given in the table below: 


Elements      Attributes                  Description
<lexicon>     version, xml:base,          root element for PLS
              xmlns, xml:lang,
              alphabet, xml:script
<meta>        name, http-equiv,           element containing meta data
              content
<metadata>                                element containing meta data
<lexeme>      xml:id, role                the container element for a single lexical
                                          entry
<rootword>                                container element for a root word that
                                          contains nested derived root words with
                                          their prefix and suffix information
<stem>                                    container element for derivational words
                                          containing affixes of the root word
<grapheme>    origin, pos, pre-fix,       contains orthographic information for a
              MWE, meaning                lexeme, its origin, its part-of-speech
                                          label, prefix, multi-word expression (MWE)
                                          type and meaning, if any. The origin
                                          attribute contains the ISO 639-3 code of
                                          the language from which the word has been
                                          borrowed. The standard POS tagset is
                                          referred to as “BIS”.
<suffix>                                  element containing all the suffixes of the
                                          particular root word; may be nested
<inf>                                     container for all the inflections of a
                                          particular stem
<phoneme>     prefer, alphabet            contains pronunciation information for a
                                          lexeme
<alias>       prefer                      contains acronym expansions and
                                          orthographic substitutions
<example>                                 contains an example of the usage for a
                                          lexeme


Table 7/2: XML Structure of PLS 2.0 Framework 


“Script Attribute” of <lexicon>: 


Punjabi is written in two scripts, i.e. the Gurmukhi script (used in Punjab, India) and the 
Shahmukhi script, a Perso-Arabic script (used in Punjab, Pakistan). Although the 
scope of the thesis is limited to the Gurmukhi script, it is appropriate to add a 
script attribute to the lexicon to cater to the users of both scripts and to keep the 
framework resilient. The script value can be encoded in the PLS 
lexicon as a four-letter code as per ISO 15924 “Codes for the representation 
of names of scripts”. The code value for Gurmukhi is “Guru”, and a code value for 
Shahmukhi is not yet assigned in the standard. The xml:lang attribute is already 
provisioned in the PLS; the code value “pan” will be encoded in the sample PLS data 
as per ISO 639-3 “Codes for the representation of names of languages”. 

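
The root element described above can be sketched as follows. This is only an illustration: the helper name `make_lexicon` is hypothetical, and `xml:script` is the attribute proposed in this framework, not part of W3C PLS 1.0 or of the attributes defined for the `xml` namespace, so a validating PLS 1.0 processor may not accept it.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the reserved "xml" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def make_lexicon(lang="pan", script="Guru", alphabet="ipa"):
    """Create a <lexicon> root carrying language and script codes."""
    # ISO 15924 script codes consist of four letters.
    if not (len(script) == 4 and script.isalpha()):
        raise ValueError("script must be a four-letter ISO 15924 code")
    root = ET.Element("lexicon", {"version": "1.0", "alphabet": alphabet})
    root.set(f"{{{XML_NS}}}lang", lang)      # ISO 639-3 language code
    root.set(f"{{{XML_NS}}}script", script)  # proposed script attribute
    return root


lexicon = make_lexicon()
print(ET.tostring(lexicon, encoding="unicode"))
```

Keeping the script code on the root element means that a Shahmukhi lexicon could later reuse the same structure with a different code, once one is chosen.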
Element <rootword> 


It is a container element for a root word and all its inflected forms. The 
<rootword> element contains one <grapheme> element and the corresponding 
<phoneme> element. The <rootword> element forms multiple orthographies and 
corresponding pronunciations using affixes. 


Origin Attribute 


There are many borrowed words, as discussed in chapter 6. The origin attribute 
contains the information about the language from which the word has been borrowed and 
will be used only for borrowed words. 


POS Attribute 


It is important to encode POS information for each lexeme, viz. the root word and its 
inflected words. The standard POS labels will be used as per Annexure I 
(appendix D) of Chapter 6 to encode the POS attribute for each lexeme, and the tagset 
will be referred to as “BIS”. 
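
Since the POS attribute carries both the tagset name and the label, a consumer of the lexicon has to split the value before using it. The sketch below assumes the value format used in the sample data of this chapter (for example `BIS: V_VM` or `BIS:JJ`); the names `PosLabel` and `parse_pos` are hypothetical.

```python
from typing import NamedTuple, Optional


class PosLabel(NamedTuple):
    tagset: str             # e.g. "BIS"
    category: str           # top-level category, e.g. "V" or "JJ"
    subtype: Optional[str]  # sub-type, e.g. "VM"; None if the label has no sub-type


def parse_pos(value: str) -> PosLabel:
    """Split a pos attribute value such as 'BIS: V_VM' into its parts."""
    tagset, _, label = value.partition(":")
    label = label.strip()
    if not label:
        raise ValueError(f"pos value has no tagset prefix: {value!r}")
    category, _, subtype = label.partition("_")
    return PosLabel(tagset.strip(), category, subtype or None)


verb = parse_pos("BIS: V_VM")
adjective = parse_pos("BIS:JJ")
```

Splitting the value in this way allows entries to be filtered by top-level category (all verbs, all pronouns) without string matching on the full label.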


Prefix Attribute 


The words generated from the root with the addition of a prefix will also be entered as 
lexemes within the rootword container; however, the prefix attribute will be added to 
their <grapheme> and <phoneme> elements. 


Suffix Element 


The words generated from the root with the addition of a suffix will also be entered as 
lexemes within the rootword container, and the suffix element will be added. 


Multi-Word Expression (MWE) Attribute 


A combination of two or more words which conveys specific information needs to 
be encoded, as the use of such words is very common, as discussed in section 6.4.7. This 
attribute will also be used for encoding echo words, duplicate words, idioms/proverbs, 
compound words etc. 


7.4 Sample PLS Data in Conformance with PLS 2.0 Framework 


Punjabi morphology is highly inflectional, as discussed in section 6.2. Verbs have the 
maximum number of inflections. There are some words which are used both as native and as 
borrowed words. The linguistic variations discussed in the previous chapters need to be 
captured in the PLS data for complete coverage of the language; hence the XML of 
representative examples is given in the following sections: 


7.4.1 Verb/Noun ਕਾਰ /kar/ 


As a Tadbhava, it is a native word of Punjabi and is used as a verb; as a Tatsama, it is a 
word borrowed from English and used as a noun. Samples of the lexicon XML are given 
below: 


ਕਾਰ /kar/ - the verb has 10 inflections, viz. 3 with prefixes and 7 with suffixes 


<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
  xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
    http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
  alphabet="ipa" xml:lang="pan">
<lexeme>
  <rootword> <!-- native root word /kar/ as verb starts here -->
    <grapheme pos="BIS: V_VM"> ਕਾਰ </grapheme>
    <phoneme> /kar/ </phoneme>
    <stem> <!-- stems of the root word /kar/ start here -->
      <inf> <!-- inflections of native root word /kar/ using prefixes start here -->
        <grapheme prefix="ਅ" pos="BIS: N_NN"> ਅਕਾਰ </grapheme>
        <phoneme> /ə'kar/ </phoneme>
        <grapheme prefix="ਅਧਿ" pos="BIS: N_NN"> ਅਧਿਕਾਰ </grapheme>
        <phoneme> /adi'kar/ </phoneme>
        <grapheme prefix="ਬੇ" pos="BIS: JJ"> ਬੇਕਾਰ </grapheme>
        <phoneme> /be'kar/ </phoneme>
      </inf> <!-- inflections of native root word /kar/ using prefixes end here -->
      <suffix> <!-- suffixes of the native root word /kar/ start here -->
        <inf> <!-- inflections of native root word /kar/ using suffixes start here -->
          <grapheme> ਕਾਰਗਰ </grapheme>
          <phoneme> /kar'gar/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਆਮਦ </grapheme>
          <phoneme> /kar-a'mod/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਸੇਵਾ </grapheme>
          <phoneme> /kar-se'va/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਕਰਦਗੀ </grapheme>
          <phoneme> /kar-karda'gi/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਖ਼ਿਦਮਤ </grapheme>
          <phoneme> /kar-xɪdo'mot/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਮੁਖ਼ਤਿਆਰ </grapheme>
          <phoneme> /kar-moxtɪ'ar/ </phoneme>
          <grapheme MWE="compound"> ਕਾਰ-ਵਿਹਾਰ </grapheme>
          <phoneme> /kar-vɪ'har/ </phoneme>
        </inf> <!-- inflections of native root word /kar/ end here -->
      </suffix> <!-- suffixes of native root word /kar/ end here -->
    </stem> <!-- stems of native root word /kar/ end here -->
  </rootword> <!-- native root word /kar/ ends here -->
</lexeme>
</lexicon>


[Figure: tree view of the above entry, listing each grapheme with its pos and prefix 
attributes followed by its phoneme] 

Fig 7/2: Tree view by XML Reader 
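
The nested structure shown above can be flattened into grapheme-phoneme pairs for lookup. The following is a minimal sketch, assuming a well-formed fragment in which every `<grapheme>` is followed by its `<phoneme>` within the same parent element; the function name `pronunciation_pairs` and the shortened sample (without namespace declarations) are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the proposed PLS 2.0 layout.
SAMPLE = """
<lexeme>
  <rootword>
    <grapheme pos="BIS: V_VM">ਕਾਰ</grapheme>
    <phoneme>/kar/</phoneme>
    <stem>
      <inf>
        <grapheme prefix="ਬੇ" pos="BIS: JJ">ਬੇਕਾਰ</grapheme>
        <phoneme>/be'kar/</phoneme>
      </inf>
      <suffix>
        <inf>
          <grapheme MWE="compound">ਕਾਰ-ਸੇਵਾ</grapheme>
          <phoneme>/kar-se'va/</phoneme>
        </inf>
      </suffix>
    </stem>
  </rootword>
</lexeme>
"""


def pronunciation_pairs(xml_text):
    """Pair each <grapheme> with the <phoneme> that follows it in the same parent.

    Returns a list of (grapheme, phoneme, attributes) tuples in document order.
    """
    pairs = []
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        current = None
        for child in parent:
            if child.tag == "grapheme":
                current = child
            elif child.tag == "phoneme" and current is not None:
                pairs.append(
                    (current.text.strip(), child.text.strip(), dict(current.attrib))
                )
                current = None
    return pairs


for grapheme, phoneme, attributes in pronunciation_pairs(SAMPLE):
    print(grapheme, phoneme, attributes)
```

A flat list of this kind can be indexed by grapheme, which is one possible route towards the search optimization of PLS data mentioned in the introduction of this chapter.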


A sample XML entry of lexicon for the root word ਕਾਰ /kar/ as borrowed from English, 
used as a noun in the Punjabi language and having 2 inflections formed with suffixes: 

<lexeme>
  <rootword> <!-- borrowed root word /kar/ starts here -->
    <grapheme origin="eng" pos="BIS: N_NN"> ਕਾਰ </grapheme>
    <phoneme> /kar/ </phoneme>
    <stem> <!-- stems of borrowed root word /kar/ start here -->
      <suffix> <!-- suffixes of borrowed word /kar/ start here -->
        <inf> <!-- inflections start here -->
          <grapheme> ਕਾਰੋਂ </grapheme>
          <phoneme> /ka'rõ/ </phoneme>
          <grapheme> ਕਾਰਾਂ </grapheme>
          <phoneme> /ka'rã/ </phoneme>
        </inf> <!-- inflections of borrowed root word /kar/ end here -->
      </suffix> <!-- suffixes of borrowed word /kar/ viz. car end here -->
    </stem> <!-- stems of borrowed root word /kar/ end here -->
  </rootword> <!-- borrowed root word /kar/ ends here -->
</lexeme>


Tree View Result: 


[Figure: tree view of the borrowed root word entry, showing the grapheme with its 
origin and pos attributes, its phoneme and the nested stem with its inflections] 

Fig 7/3: Tree view by XML Reader 
7.4.2 Pronouns 


The hierarchy of pronouns (6 layers) as defined in the Standard POS Tag Set has 
been implemented in XML as given below: 


<lexeme>
  <grapheme pos="BIS: PR_PRP"> ਮੈਂ </grapheme>
  <phoneme> /mɛ̃/ </phoneme>
  <grapheme pos="BIS: PR_PRF"> ਆਪਣਾ </grapheme>
  <phoneme> /apa'na/ </phoneme>
  <grapheme pos="BIS: PR_PRL"> ਜਿਸ </grapheme>
  <phoneme> /dʒɪs/ </phoneme>
  <grapheme pos="BIS: PR_PRC"> ਆਪਸ </grapheme>
  <phoneme> /a'pas/ </phoneme>
  <grapheme pos="BIS: PR_PRQ"> ਕਦੋਂ </grapheme>
  <phoneme> /ko'dõ/ </phoneme>
  <grapheme pos="BIS: PR_PRI"> ਕੋਈ </grapheme>
  <phoneme> /koi/ </phoneme>
</lexeme>
Tree View Result: 


[Figure: tree view of the pronoun entry, listing the six graphemes with their pos 
attributes (PR_PRP, PR_PRF, PR_PRL, PR_PRC, PR_PRQ, PR_PRI) and phonemes] 

Fig 7/4: Tree view by XML Reader 


7.4.3 Demonstrative Words 


The hierarchy of demonstrative words (4 layers) as defined in the Standard POS Tag 
Set has been implemented in XML as given below: 


<lexeme>
  <grapheme pos="BIS: DM_DMD"> ਇਹ </grapheme>
  <phoneme> /i/ </phoneme>
  <grapheme pos="BIS: DM_DMR"> ਜੋ </grapheme>
  <phoneme> /dʒo/ </phoneme>
  <grapheme pos="BIS: DM_DMQ"> ਕੌਣ </grapheme>
  <phoneme> /kon/ </phoneme>
  <grapheme pos="BIS: DM_DMI"> ਕਿਸ </grapheme>
  <phoneme> /kis/ </phoneme>
</lexeme>


Tree View Result: 


[Figure: tree view of the demonstrative entry, listing the four graphemes with their pos 
attributes (DM_DMD, DM_DMR, DM_DMQ, DM_DMI) and phonemes] 

Fig 7/5: Tree view of Demonstrative Words by XML Reader 


7.4.4 Verb ਘੜ /kaɽ/ 


A sample XML entry of lexicon for the root word ਘੜ /kaɽ/, a verb containing the toneme ਘ 
/gʱ/, having nine stems and a total of 41 inflections, of which 4 are formed with prefixes. 
The causative form of ਘੜ /kaɽ/, i.e. ਘੜਵਾ /'kaɽva/, has been encoded as a separate root 
word with 9 stems and a total of 37 inflections: 
<lexeme> 


<rootword> 
<grapheme pos="BIS: V_VM"> Wz </grapheme> 


<phoneme> /kat/ </phoneme> 


<stem> 


<grapheme pos=“BIS: V_VM”> wf“ </grapheme> 


<phoneme> /karia/ </phoneme> 
<inf> 


<grapheme prefix=“We”’> M=EwyfMr </grapheme> 


<phoneme> /anka'tia/ </phoneme> 

<grapheme prefix=e’>wewfani" </grapheme> <phoneme> /anks'ia/ 
</phoneme> 

<grapheme prefix=“WE”>WE4yat </grapheme>  <phoneme> /anks ‘ti/ 
</phoneme> 

<grapheme prefix=WE">WeyaMy </grapheme> <phoneme> /anks'tia/ 
</phoneme> 


</inf> 


<suffix> 

<inf> 

<grapheme> Wa </grapheme> <phoneme> /kdte/ </phoneme> 
<grapheme> 4a </grapheme> <phoneme> /kdri/ </phoneme> 
<grapheme> Wat </grapheme> <phoneme> /karia/ </phoneme> 


<grapheme> wf </grapheme> <phoneme> /kdria/ </phoneme> 


</inf> </suffix> 
</stem> 


<stem> 


<grapheme pos=“BIS:V_VM’> Wa" </grapheme> 


<phoneme> /kdrada/ </phoneme> 
<suffix> 
<inf> 
<grapheme> Wat </grapheme> <phoneme> /katade/ 
</phoneme> 
<grapheme> Yet </grapheme> <phoneme> /katadi/ 
</phoneme> 
<grapheme> YsemM </grapheme> <phoneme> /karadia/ 
</phoneme> 
<grapheme> Yate </grapheme> <phoneme> /karadia/ 
</phoneme> 
</inf> </ suffix > 

</stem> 


<stem> 
<grapheme pos="BIS:V_VM"> wad </grapheme> 


<phoneme> /kdtad6/ </phoneme> 
<suffix> 
<inf> 
<grapheme> wate </ grapheme> <phoneme> /kayadio/ 
</phoneme> 
<grapheme> Yates </grapheme> <phoneme> /katadio/ 
</phoneme> 
<grapheme> YFel </grapheme> <phoneme> /karadio/ 
</phoneme> 
</inf> </suffix > 
</stem> 


<stem> 

<grapheme pos=“BIS:V_VM”> 4zart </grapheme> 

<phoneme> /kdrana/ </phoneme> 

<suffix> 

<inf> 

<grapheme> 43 </grapheme> <phoneme> /katane/ 


</phoneme> 
<grapheme> 4aot </grapheme> <phoneme> /kareni/ 


</phoneme> 
<grapheme> YxoMi </grapheme> <phoneme> /karania/ 
</phoneme> 
<grapheme> 42S </grapheme> <phoneme> /kstan/ </phoneme> 
<grapheme> wae </ grapheme> <phoneme> /karano/ 
</phoneme> 


</inf> </ suffix> 
</stem> 


<stem> 


<grapheme pos=“BIS:V_VM’> Wat </grapheme> 


<phoneme> /kata/ </phoneme> 
<inf> 


<grapheme prefix="We”> MEWS </grapheme> 


<phoneme> /on kate/ </phoneme> 

</inf> 

<suffix> 

<inf> 

<grapheme> wale </grapheme> <phoneme> /kdtie/ </phoneme> 
<grapheme> Wa </grapheme> <phoneme> /kat6/ </phoneme> 
<grapheme> we </grapheme> <phoneme> /kato/ </phoneme> 
<grapheme> Wa </grapheme> <phoneme> /kate/ </phoneme> 
<grapheme> 436 </grapheme> <phoneme> /katon/ </phoneme> 


</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos=“BIS:V_VM”> Wad" </grapheme> 


<phoneme> /karaga/ </phoneme> 


<suffix> 

<inf> 

<grapheme> Yad! </grapheme> <phoneme> /karage/ </phoneme> 
<grapheme> Yadl </grapheme> <phoneme> /karéga/ </phoneme> 
<grapheme> wadt </grapheme> <phoneme> /katoge/ </phoneme> 
<grapheme> Y3dI" </grapheme> <phoneme> /katega/ </phoneme> 
<grapheme> YWHOd! </grapheme> <phoneme> /katange/ </phoneme> 

</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos="BIS:V_VM"> Wadit </grapheme> 


<phoneme> /karagi/ </phoneme> 
<suffix> 


<inf> 


<grapheme> Wadi" </grapheme> <phoneme> /karagia/ </phoneme> 
<grapheme> Wait </grapheme> <phoneme> /karégi/ </phoneme> 
<grapheme> wadimt </grapheme> <phoneme> /kdrpogia/ </phoneme> 
<grapheme> Wait </grapheme> <phoneme> /kategi/ </phoneme> 
<grapheme> YaodIM </grapheme> <phoneme> /kdrangia/ </phoneme> 
</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos=“BIS:V_VM”> wate" </grapheme> 


<phoneme> /kdrida/ </phoneme> 

<suffix> 

<inf> 

<grapheme> Wate </grapheme> <phoneme> /kadzide/ 
</phoneme> 

<grapheme>Watet</grapheme> <phoneme> /karidi/ 
</phoneme> 

<grapheme>Yatemi'</grapheme> <phoneme> /kaidia/ 
</phoneme> 

</inf> </suffix> 


</stem> 


<stem> 
<grapheme pos="BIS:V_VM"> wt </grapheme> 


<phoneme> /katti/ </phoneme> 


<suffix> 

<inf> 

<grapheme> wat </ grapheme> <phoneme> /ka {i/ </phoneme> 
<grapheme> w4fa@ </grapheme> <phoneme> /k3t10/ </phoneme> 
<grapheme> “Zz </grapheme> <phoneme> /katu/ </phoneme> 


</inf> </suffix> 
</stem> 


</rootword> 


<rootword> 


<grapheme pos=“BIS:V_VM’> Wa" </grapheme> //causative form of verb// 


<phoneme> /k3 tava/ </phoneme> 


<stem> 


<grapheme pos=“BIS:V_VM”> Yae"Ge" </grapheme> 


<phoneme> /kdtavauna/ </phoneme> 
<suffix> 
<inf> 
<grapheme> YA&lt;"BE </grapheme> <phoneme> /katavaone/ </phoneme> 
<grapheme> Yae"et </grapheme> <phoneme> /kdtavaoni/ </phoneme> 
<grapheme> YIe"GEM </grapheme> <phoneme> /k3tavaonia/ </phoneme> 
<grapheme> YaS'VE </grapheme> <phoneme> /k3tavaon/ </phoneme> 
<grapheme> wWaeee </grapheme> <phoneme> /k3tavaond/ </phoneme> 

</inf> 

</suffix> 


</stem> 


<stem> 


<grapheme pos="BIS:V_VM"> weeeer </grapheme> 


<phoneme> /ka tavadda/ </phoneme> 
<suffix> 


<inf> 


<grapheme> weet </ grapheme> <phoneme> /kdtavadde/ 
</phoneme> 

<grapheme> Yae"Sut </ grapheme> <phoneme> /kdtavacdi/ 
</phoneme> 

<grapheme> Yae"@emt </ grapheme> <phoneme> /k3tavacddia/ 
</phoneme> 

<grapheme> Yae@tent </ grapheme> <phoneme> /k3tavacddia/ 
</phoneme> 

</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos=“BIS:V_VM’> wage </grapheme> 


<phoneme> /k3 pava6d6/ </phoneme> 

<suffix> 

<inf> 

<grapheme> weeeae </ grapheme> <phoneme> /k3tavaddid/ 
</phoneme> 

<grapheme> wae@feg </ grapheme> <phoneme> /k3tavaddio/ 


</phoneme> 
<grapheme> weeeHe </grapheme> <phoneme> /kdtavaddio/ 
</phoneme> 
</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos=“BIS:V_VM”> wees </grapheme> 


<phoneme> /kdtavati/ </phoneme> 

<suffix> 

<inf> 

<grapheme> wert </ grapheme> <phoneme> /katavai/ 
</phoneme> 

<grapheme> Yaete </grapheme> <phoneme> /katavato/ 
</phoneme> 

<grapheme> wWaee </grapheme> <phoneme> /ka3tavau/ 
</phoneme> 

</inf> </suffix> 


</stem> 


<stem> 
<grapheme pos="BIS:V_VM"> Waeten"</grapheme> 


<phoneme> /kdravaia/ </phoneme> 

<suffix> <inf> 

<grapheme> YWae"e </grapheme> <phoneme> /katovae/ </phoneme> 
<grapheme> Yae"et </grapheme> <phoneme> /katevai/ </phoneme> 
<grapheme> Yee </grapheme> <phoneme> /katavaia/ </phoneme> 
<grapheme> YaeeI" </grapheme> <phoneme> /kdpavaia/ </phoneme> 

</inf> 

</suffix> 

</stem> 


<stem> 


<grapheme pos="BIS:V_VM"> WaeeteT </grapheme> 


<phoneme> /kdravaida/ </phoneme> 
<suffix> <inf> 
<grapheme> WaeTete </grapheme> <phoneme> /k3tavaide/ </phoneme> 
<grapheme> YWae"etet </grapheme> <phoneme> /k3tavaidi/ </phoneme> 
<grapheme> Yaeete Mt </grapheme> <phoneme> /kdcavaidia/ </phoneme> 

</inf> </suffix> 


</stem> 


<stem> 


<grapheme pos="BIS:V_VM"> Wae"e"</grapheme> 


<phoneme> /ka pavava/ </phoneme> 

<suffix> 

<inf> 

<grapheme> Yae"ete </grapheme> <phoneme> /katevaie/ </phoneme> 
<grapheme> Yao" </grapheme> <phoneme> /katevaé/ </phoneme> 
<grapheme> Yae"G </grapheme> <phoneme> /katavao/ </phoneme> 
<grapheme> YWae"e </grapheme> <phoneme> /katevae/ </phoneme> 
<grapheme> YASS </grapheme> <phoneme> /kdtavaon/ </phoneme> 
</inf> </suffix> 

</stem> 


<stem> 


<grapheme pos="BIS:V_VM"> Wae"etdI" </grapheme> 


<phoneme> /kstavavaga/ </phoneme> 


<suffix> <inf> 


<grapheme>Wae"e"dl</grapheme> <phoneme> /k3tavavage/ </phoneme> 
<grapheme>Yae"edl'</grapheme> <phoneme> /karavaéga/ </phoneme> 
<grapheme>Yae"@dl </grapheme> <phoneme> /k3tavaoge/ </phoneme> 
<grapheme>Wae'edI </grapheme> <phoneme> /kdtavaega/ </phoneme> 
<grapheme>Wae"Sedl </grapheme> <phoneme> /k3tavaonge/ </phoneme> 


</inf> </suffix> 


</stem> 


< stem > 


<grapheme pos=“BIS:V_VM”> Yae"etat </grapheme> 


<phoneme> /kdtavavagi/ </phoneme> 


<suffix> <inf> 


<grapheme> Yae"etgimt </grapheme> <phoneme>/kdravavagia/ 
</phoneme> 

<grapheme> Yaeedit </grapheme> <phoneme> /kdrtavaégi/ 
</phoneme> 

<grapheme> Yae"@diMi </grapheme> <phoneme> /karavadgia/ 
</phoneme> 

<grapheme> Yae"edit </grapheme> <phoneme> /k3tavaegi/ 
</phoneme> 

<grapheme> YIe'CedIM </grapheme> <phoneme> /k3cavavngia/ 
</phoneme> 


</inf> </suffix> 


</stem> 


</rootword> 
7.4.5 Adjective ਗਾੜ੍ਹਾ /ga'ɽa/ 


A sample XML entry of lexicon for the root word ਗਾੜ੍ਹਾ /ga'ɽa/, an adjective and a tonal 
word with a conjunct of /ɦ/, having 2 inflections of the root word and 1 inflection of the 
stem ਗਾੜ੍ਹੀ /ga'ɽi/: 

<lexeme>
  <rootword>
    <grapheme pos="BIS:JJ"> ਗਾੜ੍ਹਾ </grapheme> <phoneme> /ga'ɽa/ </phoneme>
    <suffix>
      <inf>
        <grapheme> ਗਾੜ੍ਹੇ </grapheme> <phoneme> /ga'ɽe/ </phoneme>
        <grapheme> ਗਾੜ੍ਹਿਆਂ </grapheme> <phoneme> /ga'ɽia/ </phoneme>
      </inf>
    </suffix>
    <stem>
      <grapheme pos="BIS:JJ"> ਗਾੜ੍ਹੀ </grapheme> <phoneme> /ga'ɽi/ </phoneme>
      <suffix>
        <inf>
          <grapheme> ਗਾੜ੍ਹੀਆਂ </grapheme> <phoneme> /ga'ɽia/ </phoneme>
        </inf>
      </suffix>
    </stem>
  </rootword>
</lexeme>


[Figure: tree view of the adjective entry, showing the root grapheme with its pos 
attribute, its inflections and the nested stem with their phonemes] 

Fig 7/6: Tree view by XML Reader 
7.4.6 Adverb ਬਾਹਰਵਾਰ /bahor'var/ 


A sample XML entry of lexicon for the word ਬਾਹਰਵਾਰ /bahor'var/, an adverb: 


<lexeme>
  <grapheme pos="BIS:RB"> ਬਾਹਰਵਾਰ </grapheme> <phoneme> /bahor'var/ </phoneme>
</lexeme>


7.4.7 Postposition ਨਾਲ /nal/ 


A sample XML entry of lexicon for the word ਨਾਲ /nal/, a postposition: 


<lexeme>
  <grapheme pos="BIS:PSP"> ਨਾਲ </grapheme> <phoneme> /nal/ </phoneme>
</lexeme>


7.4.8 Conjunction ਅਤੇ /ə'te/ 


A sample XML entry of lexicon for the word ਅਤੇ /ə'te/, a conjunction: 


<lexeme>
  <grapheme pos="BIS:CC"> ਅਤੇ </grapheme> <phoneme> /ə'te/ </phoneme>
</lexeme>
7.4.9 Multi-Word Expressions 


Sample XML data of echo words: 


<lexeme>
  <grapheme MWE="echo"> ਉੱਡ-ਪੁੱਡ </grapheme> <phoneme> /udd-pudd/ </phoneme>
  <grapheme MWE="echo"> ਉੱਜੜ-ਪੁੱਜੜ </grapheme> <phoneme> /u'dʒdʒor-po'dʒdʒor/ </phoneme>
</lexeme>


Tree View Result: 


[Figure: tree view of the echo word entry, showing the two graphemes with MWE=echo 
and their phonemes] 

Fig 7/7: Tree view by XML Reader 
Sample XML entry of duplicates: 

<lexeme>
  <grapheme MWE="duplicate"> ਤਾਰ-ਤਾਰ </grapheme> <phoneme> /tar-tar/ </phoneme>
  <grapheme MWE="duplicate"> ਤ੍ਰਿਪ-ਤ੍ਰਿਪ </grapheme> <phoneme> /'torip-'torip/ </phoneme>
</lexeme>

A sample XML entry of abbreviations and a cardinal-ordinal pair: 

<lexeme>
  <grapheme origin="eng" pos="BIS: N_NN"> ਡਾਕਟਰ </grapheme> <alias> ਡਾ. </alias>
  <phoneme> /dak'tar/ </phoneme>
  <inf> ਡਾਕਟਰਾਂ </inf> <phoneme> /dakto'ra/ </phoneme>
</lexeme>
<lexeme>
  <grapheme> ਇੱਕ </grapheme> <alias> 1 </alias> <phoneme> /ɪkk/ </phoneme>
</lexeme>
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: &@ grapheme -: s. 
@pos=8iIS: N_NN 


& alias -_saca 
= phoneme -: /dak ter/ 
@int: sas 


‘= phoneme - /qakte ra/ 


+---{ @& texeme | 


&# grapheme -:1 
@ alias - fa 


& phoneme -: /ikk/ 


Fig 7/8: Tree view by XML Reader 
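
The alias element lets a speech front end expand an abbreviation or a digit to its full written form and pronunciation. The following Python sketch builds such a lookup table. The lexicon wrapper element, the function name alias_table and the Gurmukhi and IPA strings are assumptions made for illustration, and inflected forms are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Assumed reading of the abbreviation and numeral samples.
LEXICON = """
<lexicon>
  <lexeme>
    <grapheme origin="eng" pos="BIS:N_NN">ਡਾਕਟਰ</grapheme> <alias>ਡਾ.</alias>
    <phoneme>/dak'tər/</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>ਇੱਕ</grapheme> <alias>1</alias> <phoneme>/ikk/</phoneme>
  </lexeme>
</lexicon>
"""


def alias_table(xml_text):
    """Map each alias string to the (grapheme, phoneme) of its lexeme."""
    table = {}
    for lexeme in ET.fromstring(xml_text).iter("lexeme"):
        grapheme = lexeme.findtext("grapheme", default="").strip()
        phoneme = lexeme.findtext("phoneme", default="").strip()
        for alias in lexeme.findall("alias"):
            table[alias.text.strip()] = (grapheme, phoneme)
    return table


table = alias_table(LEXICON)
print(table["1"])
```

A text normaliser could consult this table before grapheme-to-phoneme conversion, so that the digit and the abbreviation are both spoken as the full word.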


7.4.10 Homographs 
Sample XML Entry of homographs: 


<lexeme>
<rootword>
<stem>
<grapheme pos="JJ"> ਹਰਾ </grapheme> <phoneme> /hə'ra/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰੇ </grapheme> <phoneme> /hə're/ </phoneme>
<grapheme> ਹਰਿਆਂ </grapheme> <phoneme> /hə'rɪa/ </phoneme>
<grapheme> ਹਰੀ </grapheme> <phoneme> /hə'ri/ </phoneme>
<grapheme> ਹਰੀਆਂ </grapheme> <phoneme> /hə'ria/ </phoneme>
</inf>
</suffix>
</stem>


<stem>
<grapheme pos="MWE"> ਹਰਾ-ਭਰਾ </grapheme> <phoneme> /hə'ra-pə'ra/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰੇ-ਭਰੇ </grapheme> <phoneme> /hə're-pə're/ </phoneme>
<grapheme> ਹਰਿਆਂ-ਭਰਿਆਂ </grapheme> <phoneme> /hə'rɪa-pə'rɪa/ </phoneme>
<grapheme> ਹਰੀ-ਭਰੀ </grapheme> <phoneme> /hə'ri-pə'ri/ </phoneme>
<grapheme> ਹਰੀਆਂ-ਭਰੀਆਂ </grapheme> <phoneme> /hə'ria-pə'ria/ </phoneme>
</inf>
</suffix>
</stem>
<grapheme pos="BIS:V_VM"> ਹਰਾ </grapheme> <phoneme> /hə'ra:/ </phoneme>
<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਉਣਾ </grapheme> <phoneme> /həraʊ'na/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਉਣੇ </grapheme> <phoneme> /həraʊ'ne/ </phoneme>
<grapheme> ਹਰਾਉਣੀ </grapheme> <phoneme> /həraʊ'ni/ </phoneme>
<grapheme> ਹਰਾਉਣੀਆਂ </grapheme> <phoneme> /həraʊ'nia/ </phoneme>
<grapheme> ਹਰਾਉਣ </grapheme> <phoneme> /hə'raʊn/ </phoneme>
<grapheme> ਹਰਾਉਣੋਂ </grapheme> <phoneme> /həraʊ'nõ/ </phoneme>
</inf>
</suffix>
</stem>
<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਉਂਦਾ </grapheme> <phoneme> /həraʊ̃'da/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਉਂਦੇ </grapheme> <phoneme> /həraʊ̃'de/ </phoneme>
<grapheme> ਹਰਾਉਂਦੀ </grapheme> <phoneme> /həraʊ̃'di/ </phoneme>
<grapheme> ਹਰਾਉਂਦੀਆਂ </grapheme> <phoneme> /həraʊ̃'dia/ </phoneme>
<grapheme> ਹਰਾਉਂਦਿਆਂ </grapheme> <phoneme> /həraʊ̃'dɪa/ </phoneme>
</inf>
</suffix>
</stem>


<stem>
<grapheme pos="BIS:V_VM"> gage </grapheme> <phoneme> /harad'dd/ </phoneme>
<suffix>
<inf>
<grapheme> ISSUE </grapheme> <phoneme> /hara6' did/ </phoneme>
<grapheme> Ja ete </grapheme> <phoneme> /horad' dio/ </phoneme>
<grapheme> ITE </grapheme> <phoneme> /harad' dio/ </phoneme>
</inf>
</suffix>
</stem>
<stem>
<grapheme pos="BIS:V_VM"> Jog </grapheme> <phoneme> /ho'rati/ </phoneme>
<suffix>
<inf>
<grapheme> Jott </grapheme> <phoneme> /ho 'rai/ </phoneme>
<grapheme> JS TES </grapheme> <phoneme> /ha'rato/ </phoneme>
<grapheme> Js </grapheme> <phoneme> /ho'rau/ </phoneme>
</inf>
</suffix>
</stem>


<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਇਆ </grapheme> <phoneme> /hə'raɪa/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਏ </grapheme> <phoneme> /hə'rae/ </phoneme>
<grapheme> ਹਰਾਈ </grapheme> <phoneme> /hə'rai/ </phoneme>
<grapheme> ਹਰਾਈਆਂ </grapheme> <phoneme> /hə'raia/ </phoneme>
<grapheme> ਹਰਾਇਆਂ </grapheme> <phoneme> /hə'raɪa/ </phoneme>
</inf>
</suffix>
</stem>
<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਈਦਾ </grapheme> <phoneme> /hərai'da/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਈਦੇ </grapheme> <phoneme> /hərai'de/ </phoneme>
<grapheme> ਹਰਾਈਦੀ </grapheme> <phoneme> /hərai'di/ </phoneme>
<grapheme> ਹਰਾਈਦੀਆਂ </grapheme> <phoneme> /hərai'dia/ </phoneme>
</inf>
</suffix>
</stem>

<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਵਾਂ </grapheme> <phoneme> /hə'rava/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਈਏ </grapheme> <phoneme> /hə'raie/ </phoneme>
<grapheme> ਹਰਾਏਂ </grapheme> <phoneme> /hə'raẽ/ </phoneme>
<grapheme> ਹਰਾਓ </grapheme> <phoneme> /hə'rao/ </phoneme>
<grapheme> ਹਰਾਏ </grapheme> <phoneme> /hə'rae/ </phoneme>
<grapheme> ਹਰਾਉਣ </grapheme> <phoneme> /hə'raʊn/ </phoneme>
</inf>
</suffix>
</stem>


<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਵਾਂਗਾ </grapheme> <phoneme> /hərava'ga/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਵਾਂਗੇ </grapheme> <phoneme> /hərava'ge/ </phoneme>
<grapheme> ਹਰਾਏਂਗਾ </grapheme> <phoneme> /həraẽ'ga/ </phoneme>
<grapheme> ਹਰਾਓਗੇ </grapheme> <phoneme> /hərao'ge/ </phoneme>
<grapheme> ਹਰਾਏਗਾ </grapheme> <phoneme> /hərae'ga/ </phoneme>
<grapheme> ਹਰਾਉਣਗੇ </grapheme> <phoneme> /həraʊn'ge/ </phoneme>
</inf>
</suffix>
</stem>
<stem>
<grapheme pos="BIS:V_VM"> ਹਰਾਵਾਂਗੀ </grapheme> <phoneme> /hərava'gi/ </phoneme>
<suffix>
<inf>
<grapheme> ਹਰਾਵਾਂਗੀਆਂ </grapheme> <phoneme> /hərava'gia/ </phoneme>
<grapheme> ਹਰਾਏਂਗੀ </grapheme> <phoneme> /həraẽ'gi/ </phoneme>
<grapheme> ਹਰਾਓਗੀਆਂ </grapheme> <phoneme> /hərao'gia/ </phoneme>
<grapheme> ਹਰਾਏਗੀ </grapheme> <phoneme> /hərae'gi/ </phoneme>
<grapheme> ਹਰਾਉਣਗੀਆਂ </grapheme> <phoneme> /həraʊn'gia/ </phoneme>
</inf>
</suffix>
</stem>
</rootword>
</lexeme>
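
In the homograph entry the same written form occurs under different pos values, so a synthesiser can pick the pronunciation once a tagger has supplied the part of speech. The Python sketch below illustrates that lookup on a reduced version of the entry. The function name pronunciations_by_pos is hypothetical, and the Gurmukhi and IPA strings are an assumed reading of the sample, not the author's implementation.

```python
import xml.etree.ElementTree as ET

# Reduced, assumed version of the homograph entry (adjective vs. verb root).
ENTRY = """
<lexeme>
  <rootword>
    <stem>
      <grapheme pos="JJ">ਹਰਾ</grapheme> <phoneme>/hə'ra/</phoneme>
    </stem>
    <grapheme pos="BIS:V_VM">ਹਰਾ</grapheme> <phoneme>/hə'ra:/</phoneme>
  </rootword>
</lexeme>
"""


def pronunciations_by_pos(xml_text, word):
    """Map each pos value of `word` to the phoneme that follows its grapheme."""
    root = ET.fromstring(xml_text)
    result = {}
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != "grapheme" or (child.text or "").strip() != word:
                continue
            pos = child.get("pos")
            # The pronunciation is taken to be the next sibling element.
            nxt = children[i + 1] if i + 1 < len(children) else None
            if pos and nxt is not None and nxt.tag == "phoneme":
                result[pos] = nxt.text.strip()
    return result


by_pos = pronunciations_by_pos(ENTRY, "ਹਰਾ")
print(by_pos)
```

Inflected forms without a pos attribute are skipped by this sketch; a fuller reader would inherit the tag from the enclosing stem.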


7.5 Conclusion

Phonetically rich PLS data conforming to the PLS 2.0 framework, covering segmental as well as suprasegmental features such as stress, tone, gemination and nasalization, can be developed from the representative samples described above.

Appendix A — WORD LISTS


Chapter 3 — Word list of Tonemes 


Monosyllabic 

S.No. Words IPA transcription Meaning 
if wo /kar/ Home 
2 WA /kus/ Bribe 
3 & wy /dég/ Pace 
4. shy /tag/ Anxiety 
a: vy /pig/ Swing 
6. Gy /ag/ Doze 
a =f /d33g/ Thigh 
8. HRY /mag/ Name of month 
9. va /tfag/ Foam 
10. Ef) /tfath/ Lie 
le _— ‘ sad3/ Partnership 
12, ay /bod3/ Weight 
13. Hy /mad3d3/ Buffalo 
14. @y /Sd3/ Otherwise 
LD EY /tfid3/ Ring for fighting game 
16. ny /d33d3/ Marriage procession 
7, gig /ba d3/ Unproductive woman 
18. es /tid/ Belly 
19. ed /tér/ Heap 
20. revs /tol/ Drum 
ais oar /t5g/ Method 
22: at /tai/ Two and a Half 
23, eet /tai/ Back 
24. Are /sid/ Nosy
25, He /soq/ Trunk/dry ginger 
26. aS /ton/ Money 
2: UZ /tar/ Upper part of body 
28. ao /j6dd/ War 
29. ag /kSd/ Wall 
30. sy /pokk*/ Hunger 
31. 32 /pud/ Female Pig 
32: a /d3ib/ Tongue 

Table 1
Disyllabic 

S.No. Words IPA transcription Meaning 
i upg /koda/ Horse 
2 wet /o ti/ Watch 
3 ujAr /kdssa/ Push with hip 
4. wrest /kahi/ Grass cutter 
5 ujst /kodi/ Trick/Problem 
6. fifgt /kiggi/ Hiccup caused by crying

7. wer /kona/ Cunning 
8. ujgr /keéra/ Circumference 
9. upet /y li/ Lazy 
10. ujer /kota/ Cramming 
je faurR /nigas/ Warmth 
12. news /ankdt/ Crude 
13. fours /tfigat/ To cry out 
14. exey /ddzaby g/ Not in shape 
13. few /nigga/ Warm 
16. Gyr /ég al Prominent 
17. sult iS gi/ Comb
18. Sut /bdggi/ Horse driven cart 
19; By /Seu/ Small 
20. yt /t fSca / Flag 
24; BZ /tfaru/ Broom 
22; fsaq /tfitak/ Scold 
23. Bind /t fadgor / Anklet 
24. Ria Bat /tfoli/ Possessing something in one’s cloth worn
25. Beg /tfat/ Not very smart
26. gar /tfata/ Sense of enjoyment while in a moving vehicle/on swing
27; SNES al Jot{Slk / Bold 
28. AST Isb& al Suggestion 
2: ASST /sudzai/ To suggest 
30. foHfsH /romtfim/ In slow motion 
31. Jy /hddzu/ Tears 
32, as Ibisd30/ To put question 
33) Ast /s6d3i/ Insight 
34. HST /mad3i/ Language in Punjab 
35. Hy /d33dzu/ Sacred thread 
36. w= /takkan/ Cover 
cys i Pou /tilla/ Loose 
38. wy /taba/ Small Restaurant 
39. ort eas Particular Religious 
et Singing Group 
A0. Ares /sid V/ Nosy person 
4]. oes /bidy 1/ Munder 
42, Her /s3cp / Bull 
43, der /hdch / To wear 
44. — Sci / Edge
45. vat /tfudi/ To pinch 
46. a /bédda/ Old 
47. Unit /tua/ Smoke 
48. fice /tian/ Attention 
49. ost /tobi/ Washerman 
50. TOA /tor {/ Bow 
Sls er /t5ch / Profession 
D2: Wate /tora d/ Rich Person 
D2: UdH /tarom/ Religion 
54, et /toni / Navel 
55. ffug /1dddr/ This side 
56. yos /prodan/ Chief 
a7: HOS /moad t/ Sweet 
58. wor /3dda/ Half 
59. fad /sidda/ Simple
60. FHitt /s5d / Joining 
61. yr /k'ada/ Ate 
62. aa /gada/ Donkey 
63. fia /gidda/ Type of Ladies’ Dance 
64. ao /gédda/ Kneaded 
65. sat /po gi/ Simpleton 
66. 3e /padu/ Wanderer 
67. —— /pépit/ Scared 
68. SHH /p sem/ Ash 
69. a= /pid3dzna/ Get wet 
70.| = »wifayrR /abias/ Practice 
a Jisd /gabti r/ Serious 
72. cad /dobbsr/ To make it difficult

13. was /gorab/ Pregnant 

74. ws /dorlabb/ Rare 

Ting Stat /nabi/ Navel 

76. eat /toba/ Small Water Pond 
77. east /datba/ Small congested space 

Table 2 
Tri-syllabic 
S.No. Words IPA transcription Meaning 

1. woot /ktrona/ To stare 

2s a /tagona/ To desire 

3. fawgar /nigorta / To swallow 

4. fewer /nigarna/ To sink 

21 UY /pdgura/ Cradle 

6. eguet /oldgana/ Violation 

a Quer /tg ona/ To doze 

8. capers /k'3galna/ To rinse 

9: vRgSr /pdgdrna/ To melt 

10. Boxart /tfontfona/ Sound making toy 
11. Bust /tfSpori/ Hut 

12. Maat /tfagora/ Quarrel 

13. ital /t{Skona/ To bend 

14. se /ridzana/ Getting cooked 
15.) APgteSs /sadgidar/ Partner 

16. ayer /bud3ana/ To guess and answer 
17.) fars"@st /gidzaona/ To make habitual 
18.) @grgar /odgarna/ To spread 

19. feeor /tidora/ Public Announcement 
20. fomsoet /tilkava/ Loose
PAD @fget /{ahena/ To fall 
22. vuer /tudana/ To find 
23% — /sddana/ Two chords for pulling 
the bull 
24. Hest /so& la/ Jaggery and Dry ginger 
25.| Geng /hddonsar/ Long life 
26: Jee /hddau/ Durable 
oy a /kadai/ Embroidery 
28. Mee /goad® n/ Neighbour 
29. ger /bodapa/ Old Age 
30. amiret /goad / Neighbour 
eI Geer /t6dala/ Hazy 
SZ nfqgr /3c& ra/ Darkness 
33. Huds /sadaron/ Simple 
34. aut /kddui/ Big sewing needle 
35. aug /kidar6/ From where 
36. aus /g3ddla/ Muddy 
3% sumer /dodia/ Milky white 
38.| SHUT /namtari/ Religious Community 
ao: aunt /badua/ Bonded 
40. seal /pasuti/ Stampede 
4l. a /nibaona/ Cope up 
42. ggst /rdbana/ Sound produced by 
Cow/Bull 
43. Beet /labbana/ To find 
44. fsgs /nirpe/ Unfearful 
Table 3
Polysyllabic


S.No. Words IPA transcription Meaning 
15 wars /kdllukara/ Massacre 
2 AST /somd3adari/ Wisdom 
3 vugyESt /tfoddarpona/ Purposeless leadership 
4 fared /priftatfar/ Corruption 
Table 4
Chapter 3 — Word list of Laryngeal Consonant /h/


Monosyllabic 

S.No. Words IPA transcription Meaning 
1. WIT /a/ Sigh 
2 HAfa /sé/ Indicating togetherness 
3 ag /k6/ Mountain 
4. Yd /k'a/ Well, irrigation well 
5 = /k®6/ Discomfort, uneasiness 
6 are /ga/ Disorder, spread of harvested crop 

awaiting 
Te od /tfa/ Wish, desire, avidity 
8. =r) / tf?o/ Touch, dab, contact, tap 
9. rtrd /ta/ Fall, defeat, destruction 
10. wel /to/ Back-rest, rest 
ih, Sal /16/ Iron 
12, arg /va/ Wonderful, well-done 
Table 5 

Disyllabic 

S.No. Words IPA transcription Meaning 
1. sag /taba/ Destroyed, ruined, spoiled 
2 sg /tora/ Fright, sudden fear 
3. eH /vosa/ Trust, reliance, faith 
4. feyrg /via/ Marriage, matrimony 
5: WIT Jer /ala/ Superior, excellent 
6. fegai /éna/ These 
Me dea /hoka/ Sigh 
8. JAY /hasab/ According to rules, law 
9. salto /hogar/ Excreta of houseflies 
10. rap /hazom/ Digested
11. fans /hisab/ Account, calculation, rate 
12. Jog /honor/ Art, skill, technique 
13. JaH /hakom/ Ruler, governor, officer 
14. dud /hadzor/ Present, ready, available 
1 omer /hia/ Heart, courage, nerve 
16. sted /hitar/ Heater 
17. Is /hura/ Fist, box, buffet 
18. nae /hevan/ Animal, uncivilized person 
19. de /hotfta/ Blunt, flippant, mean 
20. jer /hor/ More, else, further 
21). ae /hod3/ Water tank, masonry tub 
22. Wot /ghar/ Food, diet, meal 
23: Atos /sahit/ Literature, literary art 
24. Af /saheb/ Master, lord, boss 
25; Hote /fahid/ Martyr 
26. nigd /ohor/ Ailment, diseases, malady 
Zs woe /ehad/ Resolve, promise 
28. WIhgIg /ahor/ Impulse, enthusiasm 
29. ASA /sahos/ Courage, boldness, daring 
30. Hn /sohad3/ Grace, beauty, delicacy 
SJ: Hfgdg /fehor/ City, town 
22; Had /fohor/ Husband 
33, Hofer /sohatk/ Assistant, helper, colleague 
34. Fiat /sehad3/ Easy, slow, tranquil 

Table 6
Trisyllabic


S.No. Words IPA transcription Meaning 
p. feedg /vidord/ Rebellion, defiance, revolt 
Rae AfsAS™ /stsuba/ Naturally, spontaneously 
a. fgarfest /hikarti/ Apologal, anecdotal 
4. foH aS /himatfal/ Himachal Pradesh 
5. Jao /holara/ Swing, oscillation, kick 
6. JaEt /hukena/ To raise, utter cry of pain 
fe —— /hesiot/ Status, position, property 
8. SHSHe /hoslamdd/ Courageousness, patience 
9. Wat /ahdkar/ Pride, arrogance 
10. wifjHa /ahisok/ Nonviolent, peaceful 
ll.) feafsog /Tftehar/ Advertisement, poster 
12. fefsgH /Itehas/ History, the past 
13. Afgde /sohtrd/ Good-hearted, kind, gentle 
14. AIS" /sohela/ Comfortable, soothing 
15. Ades /fahadot/ Martyrdom, self-sacrifice 
16.| feHfsge /tmtehan / Examination, test, trial 
V7: fago /sehora/ Chaplet, wreath, honour 
18. Aoet /sohona/ Good looking 
19. Ada /sohaga/ Borax, tincal, leveller 
20. Afsss /fehatut/ Mulberry
2h: 9 /fahodi/ Evidence, testimony 
22. Hofest /soharta/ Help, support, relief 
23; Hoe /sohai/ Who provides help 
24. Hoda" /soharna/ Bear, suffer, to support 
25. Hog /sohara/ Support, refuge, shelter 

Table 7
Polysyllabic


S.No. Words IPA transcription Meaning 
1.) wfgarenc /shisavad/ Doctrine 
2. wifgearHt /ehodnama/ Treaty, formal agreement between nations and states
3. fagsHe /sehotamdd/ Healthy
Table 8
Chapter 3 — Word list having Conjuncts of /h/


Monosyllabic 
S.No. Word IPA Transcription Meaning 
1. Is /g6l/ Fruit of mulberry 
Table 9 
Disyllabic 
S.No. Word IPA Transcription Meaning 
1. Tes /galdr/ Squirrel 
2 we /gomot/ Boil 
3 fABeE /d3ildn/ Mire, bog, mud, marsh 
4. ug /parai/ Education, study, teaching 
5 ufgar /patia/ Read, studied 
6 os /thola/ Fat person 
Table 10 
Trisyllabic 
S.No. Word IPA Transcription Meaning 
I. unyet /k®omoni/ Multicoloured yarn 
De uger /k*grdva/ Rough, rude, impolite 
3. USE /kolana/ To open, become open 
4. deer /k*ollova/ Loose, expansible 
a Aaa /salaba/ Water-logging, seepage 
6. Aor /sindna/ To moisten, make wet 
7. SYSa /tomator/ Someone like you, you 
8. uyeEt /poraona/ To teach, educate, tutor 
Table 11
Polysyllabic


S.No. Word IPA Transcription Meaning 
1. Pe rarsroul /kbolarna/ To stop, to interrupt 
one BIE C tou /gactakena/ To boil, thunder, roar 
Table 12
Chapter 4 — Word list of Stress


Di-syllabic 


S.No. Word IPA transcription 
if Hoge /sorab/ 
2s Had /socak/ 
3. THY /hasab/ 
4. Jad /honot/ 
5. aan /hozom/ 
6. SCICI /ogor/ 
ee Gnd /ozoar/ 
8. @sHe /otsov/ 
9. @s3Ha /otsok/ 
10. HdIe /fagon/ 
11. ana /kasok/ 
|e afdes /gorift/ 
13; UIs /tfogol/ 
14. Bug /omor/ 
15. WEY /anok*/ 
16. Woe /ondd/ 
17. Hae /sdkot/ 
18. Has /mdgol/ 
19. SHS /basat/ 
20. HSdI /maldg/ 
21. far /p"tkka/ 
22: wid} /agge/ 
vay Har /sadza/ 
24. uer /poka/ 
20: ys /bata/ 
26. yer /bola/
21 fat /p'idda/ 
28. feet /p'ida/ 
29, Hud /majur/ 
30. a /mona/ 
eA aH /koma/ 
32: HoT /sona/ 
a3: oA /rasa/ 
34. Seu /tona/ 
Bey garg /tfalak/ 
36. fora /gila/ 
Bae GA /jesu/ 
38. aa /jokin/ 
39. dH /rasta/ 
40. forrs /hisab/ 
4]. =r) /badgar/ 
42. Vota /morid3/ 
43. ade /korib/ 
44. T= T0) /odzen/ 
45. WoTs /okal/ 
46. WeTH /ap'im/ 
47. @saz /otat/ 
48. eae /othan/ 
49. WHATS /asan/ 
50. wHtg /amitr/ 
Si WES /dzaban/ 
D2, ast /koni/ 
53: 3At /tosa/ 
54. 3e /tad6/
55. ne /3gur/ 
56. re: /gadzor/ 
aye HAS /gokol/ 
58. O30 /nuton/ 
59. Sag /oetar/ 
60. Geo /joven/ 
61. Jae /herat/ 
62. ase /kedot/ 
63. JaH /hakom/ 
64. usd /kator/ 
65. yids /orat/ 
66. Slad /thakor/ 
67. Wray /akot/ 
68. DOE /tfanon/ 
69. yeaa /balok/ 
70. wifsH /atif/ 
Fl. Hae /sdkon/ 
(px sige /tadav/ 
138 YsSs [pec 1/ 
74. uns /pak*dd/ 
TD: om /tfatfa/ 
76. @et /preta/ 
77. pts /dzina/ 
78. HST /mali/ 
79. eg /deo/ 
80. Bg /leo/ 
81. aa /kota/ 
82. ast /phita/
83. gost /benti/ 
84. Hau? /motdar/ 
85. HAS /modzud/ 
86. eer /una/ 
87. = /sara/ 
88. aHS /iman/ 
89. ast /rani/ 
90. atesr /kap*la/ 
91. to /khira/ 
oD. aret /gani/ 
O53; Abst /sina/ 
94. 3irgr /taba/ 
95. ater /kata/ 
96. git /tfadi/ 
97. ust /péti/ 
98. aay /vedat/ 
99. Hatt /mitgi/ 

100. et /mtid3i/ 

101. aa /kbtdi/ 

Table 13
Tri-syllabic 

S.No. Words IPA transcription 
ik WagHa /akormok/ 
2 wfyst /omita/ 
3 WHES /asoptal/ 
4. SHAS /befakol/ 
5 Sfhea /besidok/
6. fesHot /tfilmotfi/ 
7. fesdar /tfilgoza/ 
8. Bene /davadol/ 
9. didd'nd /gerhadzor/ 
10. feasg /tkattor/ 
11. rarest /dgagona/ 
12, ASST /dgalalat/ 
13. Weed /dgalodar/ 
14. Hya<cs /dzobonvat/ 
15; asufegq /kalpontk/ 
16. aHHaH /kofmokaf/ 
ry, yAfeet /kbofdali/ 
18. agus /korapon/ 
19. SISS= /niletton/ 
20. uTgegHa /pardorfak/ 
Zk USUS /polapan/ 
Ze gusd /rupator/ 
23, Fane /suratmdd/ 
24. say /tdbaku/ 
2a ae /tapoben/ 
26. 3s3aqgr /tatkora/ 
27. Stara /tikakar/ 
28. feast /tikona/ 
29. @rd3 /odgorat/ 
30. @ucet /oltana/ 

Table 14
Poly-syllabic


S.No word IPA transcription 
1. wifes /okirtafil/ 
2 Hchiteg /s€timitor/ 
3 ostzet /tfetavoni/ 
4 dndeor /godzravala/ 
5 feaeanys /ikvasapon/ 
6 yer /k*atfevala/ 
7 Sarset /nakabddi/ 
8 Uesterar /pak*"rivala/ 
9 JUsIS /rupatoron/ 
10 mies /samdtvadi/ 
11 AHO Het /samrad3vadi/ 
12 HeS3d6 /sthandtoron/ 
13 Caauet /ogorpat"i/ 
14 @nsust /odzaddpona/ 

Table 15
Appendix B — DATA SHEETS


Chapter 3 — Data sheets of Tonemes 


Sample Data sheet 1 (Composite word): ਘੱਲੂਘਾਰਾ /kəllukara/

Male Speakers | F0 in (HZ/Sec) | Slope in (HZ/Sec) | Cross-Sectional Slope of TBU (HZ/Sec): 25% | 25% | 25% | 25% | Contour of tone | Duration of TBU
M1 | 199 | 628 | 209 | 203 | 196 | 188 | HL | 0.04
M2 | 248 | 348 | 254 | 248 | 246 | 243 | HL | 0.07
M3 | 256 | 321 | 263 | 259 | 256 | 248 | HL | 0.10
M4 | 224 | 1135 | 249 | 229 | 218 | 205 | HL | 0.06
Average | 232 | 608 | 244 | 235 | 229 | 221 | HL | 0.07
Table 1: Data of Male Speakers

Female Speakers | F0 in (HZ/Sec) | Slope in (HZ/Sec) | Cross-Sectional Slope of TBU (HZ/Sec): 25% | 25% | 25% | 25% | Contour of tone | Duration of TBU
F1 | 295 | 450 | 307 | 292 | 290 | 289 | HL | 0.07
F2 | IP | IP | IP | IP | IP | IP | IP | IP
F3 | 323 | 402 | 331 | 325 | 322 | 316 | HL | 0.06
F4 | 280 | 778 | 301 | 282 | 272 | 264 | HL | 0.07
F5 | IP | IP | IP | IP | IP | IP | IP | IP
F6 | 309 | 371 | 316 | 309 | 309 | 305 | HL | 0.07
Average | 302 | 500 | 314 | 302 | 298 | 294 | HL | 0.07

Table 2: Data of Female Speakers
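
The Average rows in Tables 1 and 2 are consistent with a plain arithmetic mean over the speakers that have measurements, rounded to the precision shown, with the rows marked IP left out. The Python sketch below reproduces a few of those figures under that assumption. The function name column_average and the use of None for an IP row are illustrative choices, not part of the original analysis.

```python
from statistics import mean


def column_average(values, digits=0):
    """Mean of the measured values; None marks a row without measurements."""
    measured = [v for v in values if v is not None]
    if not measured:
        return None
    avg = round(mean(measured), digits)
    return int(avg) if digits == 0 else avg


# Male speakers M1-M4, first data sheet (values copied from Table 1).
male_f0 = [199, 248, 256, 224]
male_slope = [628, 348, 321, 1135]
male_duration = [0.04, 0.07, 0.10, 0.06]

# Female speakers F1-F6 (Table 2); F2 and F5 are marked IP in the table.
female_f0 = [295, None, 323, 280, None, 309]

print(column_average(male_f0))           # mean F0 over four speakers
print(column_average(male_slope))        # mean slope
print(column_average(male_duration, 2))  # mean duration, two decimals
print(column_average(female_f0))         # IP rows are ignored
```

Checking the averages this way is a quick guard against transcription slips when the data sheets are keyed in from the acoustic analysis.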
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wT 


Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) (HZ/Sec) TBU of tone of TBU 
(HZ/Sec) 
25% | 25% | 25% | 25% 
MI 179 149 183 | 180 | 179 176 HL 0.13 
M2 256 125 258 | 258 | 256 252 HL 0.12 
M3 243 293 257 | 245 | 239 231 HL 0.19 
M4 214 297 230 | 215 | 207 204 HL 0.15 
Average 223 216 232 | 225 | 220 216 HL 0.15 
Table 3: Data of Male Speakers 
Female Fy in Slope in Cross-Sectional Slope of Conto | Duration 
Speakers | (HZ/Sec) (HZ/Sec) TBU ur of of TBU 
(HZ/Sec) tone 
25% | 25% | 25% | 25% 

Fl 325 315 325 | 315 | 297 | 278 HL 0.21 
F2 IP IP IP IP IP IP IP IP 
F3 335 328 335 | 328 | 321 304 HL 0.15 
F4 292 278 292 | 278 | 275 | 275 HL 0.14 
F5 IP IP IP IP IP IP IP IP 
F6 301 298 301 298 | 298 | 290 HL 0.16 
Average 313 305 313 | 305 | 298 | 287 HL 0.17 


Table 4: Data of Female Speakers
Sample Data sheet 2: ਰਿੱਝਣਾ /ridʒdʒəna/


Male Fy in Slope in | Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU(HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
M1 191 324 177 | 191 | 196 | 201 LH 0.11 
M2 266 156 263 | 266 | 267 | 269 LH 0.10 
M3 232 296 224 | 230 | 236 | 239 LH 0.09 
M4 217 403 207 | 218 | 221 | 222 LH 0.06 
Average 227 266 219 | 226 | 230 | 232 LH 0.09 
Table 5: Data of Male Speakers 
Female Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 289 419 278 289 293 296 LH 0.06 
F2 301 850 282 | 301 | 311 | 313 LH 0.05 
F3 314 433 298 313 320 325 LH 0.10 
F4 345 474 329 346 352 353 LH 0.09 
F5 346 346 333 | 344 | 350 | 354 LH 0.08 
F6 309 428 297 | 309 | 314 | 317 LH 0.09 
Average 317 492 303 | 317 | 323 | 326 LH 0.08 


Table 6: Data of Female Speakers
Sample Data sheet 3 (Diphthong): ਢਾਈ /ʈai/


Male Fy in Slope in | Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU(HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
MI 173 229 182 | 160 | 175 | 175 HLH 0.46 
M2 261 150 267 | 261 | 258 | 258 HL 0.35 
M3 242 155 255 | 230 | 242 | 242 HLH 0.51 
M4 201 187 211 | 192 | 192 | 206 HLH 0.34 
Average 205 190 216 | 194 | 203 | 208 HLH 0.44 
Table 7: Data of Male Speakers 
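
The contour labels (HL, LH, HLH, LHL) can be related to the four cross-sectional F0 values by tracking where the pitch changes direction. The Python sketch below is one simple way to do this; it is an assumed rule, not the procedure used to produce the data sheets, and the function name contour_label is hypothetical. The labels in the tables were presumably assigned from the full pitch track, so the sketch will not agree with every row, for example where the four values only fall but the row is labelled HLH.

```python
def contour_label(f0_points):
    """Label a pitch contour from successive F0 samples (in Hz).

    Consecutive rises collapse into one 'rise', consecutive falls into one
    'fall', and equal neighbours are ignored.
    """
    signs = []
    for prev, cur in zip(f0_points, f0_points[1:]):
        if cur == prev:
            continue  # a plateau does not change direction
        sign = 1 if cur > prev else -1
        if not signs or signs[-1] != sign:
            signs.append(sign)
    if not signs:
        return "level"
    # The starting level is the opposite of the first movement.
    label = "L" if signs[0] > 0 else "H"
    for sign in signs:
        label += "H" if sign > 0 else "L"
    return label


# Quartile values of speakers M1 and M2 from Table 7.
print(contour_label([182, 160, 175, 175]))
print(contour_label([267, 261, 258, 258]))
```

A practical version would probably add a tolerance in Hz, so that a change of one or two hertz between quartiles is not counted as a movement.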
Female Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 308 462 322 | 279 | 294 | 337 HLH 0.39 
F2 332 362 343 | 321 | 318 | 345 HLH 0.33 
F3 333 312 345 | 318 | 319 | 351 HLH 0.44 
F4 309 192 329 | 302 | 295 | 308 HLH 0.45 
F5 365 128 374 | 365 | 361 | 361 HLH 0.44 
F6 319 300 353 | 315 | 294 | 312 HLH 0.43 
Average 328 293 344 | 317 | 314 | 336 HLH 0.41 


Table 8: Data of Female Speakers
Sample Data sheet 4: ਧਰਮ /tərəm/


Male Fy in Slope in | Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU(HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Ml 187 627 207 | 196 | 182 | 165 HL 0.08 
M2 272 274 280 | 272 | 270 | 266 HL 0.11 
M3 262 308 265 | 264 | 263 | 256 HL 0.15 
M4 218 184 223 | 220 | 216 | 212 HL 0.10 
Average 235 348 244 | 238 | 233 | 225 HL 0.11 
Table 9: Data of Male speakers 
Female Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 297 327 307 | 299 | 297 | 287 HL 0.13 
F2 321 410 331 | 322 | 322 | 311 HL 0.13 
F3 317 379 328 | 325 | 316 | 299 HL 0.12 
F4 278 519 300 | 283 | 272 | 256 HL 0.14 
F5 343 299 349 | 349 | 344 | 330 HL 0.16 
F6 293 611 307 | 307 | 291 | 268 HL 0.17 
Average 308 424 320 | 314 | 307 | 292 HL 0.14 


Table 10: Data of Female speakers
Sample Data sheet 5: ਭ੍ਰਿਸ਼ਟਾਚਾਰ /priʃtatʃar/


Male FO Slopein | Cross-Sectional Slope of | Contour | Duration 
Speakers | in(HZ/Sec) | (HZ/Sec) TBU(HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
MI 177 612 196 | 185 | 170 | 156 HL 0.08 
M2 270 249 275 | 272 | 270 | 263 HL 0.08 
M3 236 544 238 | 247 | 239 | 221 HL 0.11 
M4 214 926 229 | 225 | 211 | 194 HL 0.05 
Average 224 583 235 | 232 | 223 | 209 HL 0.08 
Table 11: Data of Male speakers 
Female Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 302 928 296 | 312 | 309 | 292 HL 0.07 
F2 341 707 341 | 348 | 346 | 328 HL 0.06 
F3 312 629 334 | 319 | 302 | 294 HL 0.07 
F4 281 820 296 | 295 | 282 | 251 HL 0.08 
F5 IP 
F6 338 736 342 | 350 | 342 | 319 HL 0.12 
Average 315 764 322 | 325 | 316 | 297 HL 0.08 


Table 12: Data of Female speakers 




Chapter 3 — Data sheets of Consonant /h/ 


Sample Data sheet 1: ਖੂਹ /kʰu/


Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
M1 181 251 154 | 192 | 198 | 181 LHL 0.43 
M2 258 202 239 | 265 | 267 | 263 LHL 0.43 
M3 235 200 234 | 245 | 236 | 225 LHL 0.53 
M4 203 271 178 | 201 | 219 | 215 LHL 0.33 
Average 219 231 201 | 226 | 230 | 221 LHL 0.43 
Table 13: Data of Male speakers 

Female Fy in Slope in Cross-Sectional Slope of | Contour | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% |25 | 25% | 25% 
% 
Fl 307 317 287 | 291 | 316 | 332 LH 0.35 
F2 343 471 319 | 331 | 363 | 357 LHL 0.33 
F3 295 875 291 | 299 | 333 | 252 LHL 0.38 
F4 324 347 285 | 321 | 348 | 340 LHL 0.31 
F5 282 713 347 | 372 | 214 | 197 LHL 0.38 
F6 310 525 274 | 288 | 325 | 356 LH 0.24 
Average 310 541 301 | 317 | 317 | 306 LHL + 0.33 
LH 
Table 14: Data of Female speakers 
Sample Data sheet 2: 6Td /{a/ 
Male Fyin Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 

Ml 159 232 157 | 139 | 165 | 175 HLH 0.48 
M2 230 293 231 | 221 | 230 | 237 HLH 0.17 
M3 217 229 243 | 224 | 224 | 217 HL 0.43 
M4 188 245 188 | 171 | 191 | 202 HLH 0.35 
Average 199 250 205 | 189 | 203 | 208 HLH + 0.36 

HL 


Table 15: Data of Male speakers
Female F0 in Slope in Cross-Sectional Slope of Contour | Duration

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% 25% | 25% | 25% 
Fl 271 454 281 252 | 254 | 295 HLH 0.39 
F2 261 327 274 261 | 247 | 262 HLH 0.37 
F3 250 536 296 280 | 227 195 HL 0.30 
F4 298 382 314 279 | 294 | 304 HLH 0.38 
F5 346 192 358 331 340 | 354 HLH 0.42 
F6 270 390 275 278 | 255 | 272 HLH 0.30 
Average 283 380 300 280 | 270 | 280 HLH + 0.36 
HL 
Table 16: Data of Female speakers 
Sample Data sheet 3: CHT /vasa/ 
Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Ml 166 233 138 | 163 | 185 | 179 LHL 0.37 
M2 239 198 220 | 234 | 248 | 257 LH 0.39 
M3 223 206 218 | 227 | 225 | 224 LHL 0.44 
M4 NT 
Average 209 212 192 | 208 | 219 | 220 LHL + 0.40 
LH 
Table 17: Data of Male speakers 

Female Fy in Slope in Cross-Sectional Slope of Contou | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) r of of TBU 
25% | 25% | 25% | 25% tone 

Fl 266 318 246 | 251 | 271 296 LH 0.30 
F2 305 392 283 | 283 | 309 | 345 LH 0.27 
F3 283 465 255 | 256 | 286 | 336 LH 0.38 
F4 275 435 269 | 277 | 289 | 265 LH 0.19 
F5 339 230 317 _ | 328 | 349 | 361 LH 0.34 
F6 257 232 251 251 | 261 266 LH 0.23 
Average 288 345 270 | 274 | 294 | 312 LH 0.29 


Table 18: Data of Female speakers
Sample Data sheet 4: ਵਿਦਰੋਹ /vidəro/


Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
MI 181 194 152 | 180 | 199 | 191 LHL 0.47 
M2 246 178 227 | 245 | 255 | 256 LH 0.37 
M3 231 183 223 | 238 | 236 | 228 LHL 0.35 
M4 IP 
Average 219 185 201 | 221 | 230 | 225 LHL + 0.40 
LH 
Table 19: Data of Male speakers 
Female Fy in Slope in Cross-Sectional Slope of Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 281 337 253 | 267 | 292 | 312 LH 0.27 
F2 303 422 285 | 290 | 310 | 328 LH 0.22 
F3 Non- Tonal 
F4 301 284 277 | 289 | 315 | 324 LH 0.33 
F5 339 232 311 338 | 354 | 353 LHL 0.30 
F6 298 978 279 | 282 | 319 | 313 LHL 0.18 
Average 304 451 281 | 293 | 318 | 326 LHL + 0.26 
LH 


Table 20: Data of Female speakers
Chapter 3 — Data sheets of Conjuncts of /h/


Sample Data sheet 1: TS /g6l/ 


Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
MI 169 262 140 | 161 | 180 | 195 LH 0.06 
M2 251 170 230 | 247 | 261 | 267 LH 0.04 
M3 223 259 195 | 217 | 236 | 245 LH 0.12 
M4 186 274 158 | 174 | 199 | 214 LH 0.11 
Average 207 241 181 |; 200 | 219 | 230 LH 0.08 
Table 21: Data of Male speakers 
Female Fy in Slope in Cross-Sectional Slope of Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
Fl 260 246 249 | 250 | 262 | 279 LH 0.24 
F2 323 259 314 | 314 | 321 343 LH 0.32 
F3 303 277 278 | 295 | 312 | 326 LH 0.30 
F4 282 147 261 277 | 292 | 297 LH 0.43 
F5 329 186 319 | 326 | 332 |) 341 LH 0.25 
F6 232 143 226 | 230 | 232 | 240 LH 0.31 
Average 288 210 275 | 282 | 292 | 304 LH 0.31 


Table 22: Data of Female speakers
Sample Data sheet 2: ਸਲ੍ਹਾਬਾ /salaba/


Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 

Ml 175 415 187 | 181 | 173 | 158 HL 0.11 

M2 272 226 282 | 277 | 269 | 261 HL 0.14 

M3 235 268 243 | 242 | 235 | 219 HL 0.15 

M4 195 423 208 | 201 | 192 | 179 HL 0.11 

Average 219 333 230 | 225 | 217 | 204 HL 0.13 

Table 23: Data of Male speakers 

Female Fy in Slope in Cross-Sectional Slope of Contour | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 

Fl 276 411 291 | 286] 273 254 HL 0.13 

F2 304 381 319 | 311 300 286 HL 0.12 

F3 317 519 340 | 328) 311 288 HL 0.13 

F4 253 353 271 | 258 | 248 235 HL 0.15 

F5 334 459 345 | 343 | 334 312 HL 0.12 

F6 283 380 294 | 292] 285 262 HL 0.13 

Average 295 417 310 | 303 | 292 273 HL 0.13 

Table 24: Data of Female speakers 
Sample Data sheet 3: H3™JIAT /ktalarna/ 
Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 

Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 

Ml 145 58 148 | 147 | 145 | 143 HL 0.18 

M2 241 75 242 | 242 | 242 | 240 HL 0.18 

M3 IP 

M4 220 237 220 | 203 | 185 | 178 HL 0.22 

Average 203 123 203 | 197 | 191 | 187 HL 0.19 


Table 25: Data of Male speakers
Female F0 in Slope in Cross-Sectional Slope of Contour | Duration
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU
25% | 25% | 25% | 25%
Fl 270 342 289 | 282 | 264 | 243 HL 0.17 
F2 293 171 303 | 293 | 290 | 289 HL 0.19 
F3 285 153 295 | 288 | 282 | 278 HL 0.17 
F4 228 313 240 | 226 | 222 | 219 HL 0.21 
F5 287 192 291 | 289 | 285 | 283 HL 0.20 
F6 218 121 226 | 220 | 215 | 212 HL 0.23 
Average 264 215 274 | 266 | 260 | 254 HL 0.20 
Table 26: Data of Female speakers 
Sample Data sheet 4: SZ /galsy/ 
Male Fy in Slope in Cross-Sectional Slope of | Contour | Duration 
Speakers | (HZ/Sec) | (HZ/Sec) TBU (HZ/Sec) of tone of TBU 
25% | 25% | 25% | 25% 
MI 204 264 202 | 204 | 206 | 202 LHL 0.09 
M2 282 143 282 | 286 | 283 | 279 LHL 0.11 
M3 
M4 212 101 212 | 214 | 212 | 211 LHL 0.09 
Average 233 169 232 | 235 | 234 | 231 LHL 0.10 
Table 27: Data of Male speakers 
Female Speakers | F0 in (Hz/Sec) | Slope in (Hz/Sec) | Cross-Sectional Slope of TBU (Hz/Sec): 1st 25% | 2nd 25% | 3rd 25% | 4th 25% | Contour of tone | Duration of TBU
F1 | 304 | 554 | 282 | 297 | 309 | 327 | LH | 0.11
F2 | 344 | 560 | 326 | 338 | 347 | 365 | LH | 0.11
F3 | 355 | 135 | 355 | 358 | 357 | 351 | LHL | 0.16
F4 | 292 | 158 | 284 | 290 | 294 | 298 | LH | 0.13
F5 | 341 | 260 | 331 | 335 | 344 | 356 | LH | 0.12
F6 | 284 | 283 | 277 | 281 | 288 | 292 | LH | 0.14
Average | 320 | 325 | 309 | 317 | 323 | 332 | LH + LHL | 0.13


Table 28: Data of Female speakers 




Sample Data sheet 5: 63° /{hula/ 


Male Speakers | F0 in (Hz/Sec) | Slope in (Hz/Sec) | Cross-Sectional Slope of TBU (Hz/Sec): 1st 25% | 2nd 25% | 3rd 25% | 4th 25% | Contour of tone | Duration of TBU
M1 | 197 | 179 | 196 | 198 | 197 | 196 | LHL | 0.20
M2 | IP
M3 | IP
M4 | 215 | 67 | 215 | 217 | 216 | 213 | LHL | 0.24
Average | 206 | 123 | 206 | 208 | 207 | 205 | LHL | 0.22

Table 29: Data of Male speakers 
Female Speakers | F0 in (Hz/Sec) | Slope in (Hz/Sec) | Cross-Sectional Slope of TBU (Hz/Sec): 1st 25% | 2nd 25% | 3rd 25% | 4th 25% | Contour of tone | Duration of TBU
F1 | 322 | 140 | 310 | 321 | 328 | 329 | LH | 0.21
F2 | 309 | 129 | 302 | 304 | 312 | 320 | LH | 0.23
F3 | 340 | 267 | 326 | 332 | 346 | 359 | LH | 0.21
F4 | 287 | 188 | 280 | 289 | 291 | 288 | LHL | 0.14
F5 | 351 | 112 | 351 | 353 | 350 | 350 | LHL | 0.27
F6 | 325 | 253 | 309 | 323 | 335 | 335 | LH | 0.14
Average | 322 | 182 | 313 | 320 | 327 | 330 | LHL + LH | 0.20


Table 30: Data of Female speakers 
Appendix C- GRAPHS


Chapter 3- Graphs of Tonemes 


Fig 3: Male formant sample Fig 4: Female formant sample 
2. fares /tian/


Fig 7: Male formant sample Fig 8: Female formant sample 
3. DUSUE /tfodsrpona/


Fig 9: Male pitch sample Fig 10: Female pitch sample 


Fig 11: Male formant sample Fig 12: Female formant sample 
Fig 15: Male formant sample Fig 16: Female formant sample


5. fos@e" /nibauna/


Fig 17: Male pitch sample Fig 18: Female pitch sample 


Fig 19: Male formant sample Fig 20: Female formant sample 
6. PSE /tuddna/


Fig 21: Male pitch sample 


Fig 23: Male formant sample 


Fig 24: Female formant sample 


7. SOTHO /namtari/ 


Fig 25: Male pitch sample Fig 26: Female pitch sample 


Fig 27: Male formant sample Fig 28: Female formant sample 
8. GxeY /bdzbbg/


Fig 29: Male pitch sample Fig 30: Female pitch sample 


Fig 31: Male formant sample Fig 32: Female formant sample 
9. Tam /dodia/


Fig 33: Male pitch sample Fig 34: Female pitch sample


Fig 35: Male formant sample Fig 36: Female formant sample 
10. H8™ /siédza/


Fig 37: Male pitch sample Fig 38: Female pitch sample 


Fig 39: Male formant sample Fig 40: Female formant sample 
11. HSE /sudzai/


Fig 43: Male formant sample Fig 44: Female formant sample 


12. fOxYTH /nigis/ 


Fig 45: Male pitch sample Fig 46: Female pitch sample 
Fig 47: Male formant sample Fig 48: Female formant sample


13. fours /tfigdy/ 




Fig 51: Male formant sample Fig 52: Female formant sample 
14. UO'S /prodan/


Fig 55: Male formant sample Fig 56: Female formant sample 
Chapter 3- Graphs of Consonant /h/


1. Gd /va/


Fig 57: Male pitch sample Fig 58: Female pitch sample 




Fig 59: Male formant sample Fig 60: Female formant sample 
2. Afa /s2/


Fig 63: Male formant sample Fig 64: Female formant sample 


Fig 65: Male pitch sample Fig 66: Female pitch sample 


Fig 67: Male formant sample Fig 68: Female formant sample 


4. 3d /tara/ 


Fig 69: Male pitch sample Fig 70: Female pitch sample


Fig 71: Male formant sample Fig 72: Female formant sample 
Chapter 3- Graphs of Conjuncts of /ɦ/


Fig 75: Male formant sample Fig 76: Female formant sample 


2. YB /krollsva/ 


Fig 77: Male pitch sample Fig 78: Female pitch sample 
Fig 79: Male formant sample


3. THOS" /sindna/ 


Fig 83: Male formant sample Fig 84: Female formant sample


4. user /poyaona/


Fig 86: Female pitch sample 




Fig 87: Male formant sample 


5. ufsor /poria/ 




Fig 89: Male pitch sample Fig 90: Female pitch sample 


Fig 93: Male pitch sample 


Fig 95: Male formant sample
Fig 92: Female formant sample


Fig 96: Female formant sample 


Chapter 3- MATLAB code for plotting Independent tone graphs 


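% Read the recording (audioread replaces wavread, which newer MATLAB releases no longer provide)
[x,fs]=audioread('C:\Users\tempS\Desktop\Matlab\chidak.wav');
y=x(:,1);                      % keep the first channel only

% Estimate the pitch (F0) track with fxrapt from the VOICEBOX toolbox
[fx,tt]=fxrapt(y,fs,'u');

subplot(2,1,1),plot(y)         % upper panel: waveform

hold on

subplot(2,1,2),plot(fx)        % lower panel: F0 contour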
Chapter 4 - Experimental Study of Lexical Stress


Appendix D — ANNEXURES 


Annexure I 
Di-syllabic non-tonal words 
S. No. Word Syllabic weight Intra-syllabic 
s1 s2 stress (s;_in %) 
1. HAS /sorob/ 58.78 59.59 1.38 
2s Aaa /satak / 59.43 60.5 1.80 
2: JAS / hasab / 58.34 61.43 5.30 
4, Hale / fagon / 59.77 61.61 3.08 
my OHA / kosok / 58.97 60.57 al 
6. EX / onakh/ 60.88 61.64 129 
fe Jd /honor/ 63.5 65.8 3.62 
8. | Gad /ugor/ 64.31 65.47 1.80 
D @xd /uzor/ 59.55 59.98 O72 
10. | aS Afogel/ 61.29 61.17 -0.20 
11. | wfSe /ondd/ 59.66 60.53 1.46 
12. | Hae /sdkot/ 60.2 61.56 2.26 
13. | niet /ogge/ 59.23 60.97 2.94 
14. | FAT /sodga/ 60.33 62.36 3.36 
15. | Yat /poka/ 60.04 61.35 2.18 
16. | S3T /bata/ 56.37 60.05 6.03 
17. | SS /bola/ 56.28 60.19 6.95 
18. | &3" /pidda/ 61.44 59.2 -3.65 
19. | fe" /p'da/ 61.12 61.54 0.69 
20. | HO" /mona/ 59.6 62.4 4.70 
21. | SH /koma/ 59.82 60.32 0.84 
22. | Hat /sona/ 61.41 63.21 2.93 
23. | DAT /rasa/ 58.15 62.54 [Fs 


24. | SE /tona/ 59.04 61.47 4.12 
25. | foTS" /gila/ 56.51 61.08 8.09 
26. | GH /yosu/ 60.26 65.86 9.29 
27. | SAS /rasta/ 58.15 63.36 8.96 
28. | SAT /tusa / 62.25 64.39 3.44 
29. | Se / todd/ 60.08 61.98 3.16 
30. | @Hd /omor / 62:77 66.15 3.73 
31. | ASt/ koni/ 63.46 68.9 8.57 
32. | Hat /mdgal / 61:63 67.67 9.80 
33. | SHS / bosat/ 57.26 63.85 el 
34. | HSaT /moldg / 60.99 64.12 5.13 
35. | @SHE/ vtsov/ 59.41 65.37 10.03 
36. | @3Ha /utsuk/ 60.59 66.22 9.29 
37. | Hass /gorrpt/ 58.38 62.63 7.28 
38. | SHA /gadzor/ 57.53 60.62 DOr 
39. | HAS /gokul/ 58.26 62.83 7.84 
40. | 636 /nuton/ 59.57 62.97 wil 
41. | Sea /votor/ 57.82 61.17 a19 
42. | Wes /yovon/ 59.76 62.83 5.14 
43. | daz /herot/ 60.75 62.89 Die 
44. | A3S /kedot/ 59.82 60 0.30 
45. | aH /hakom/ 59.38 65.95 11.06 
46. | A&E /sdkon/ 62.01 64.62 4.21 
47. | 3tS /tadov/ 59.33 63.06 6.29 
48. | USS /pec |/ 59.9 63.49 5.99 
49. | wigZ /orot/ 60.54 64.04 5.78 
50. | ara /tfatfa/ 58.8 61.82 5.14 
SI. | Ser /preta/ 61.28 62.5 1.99 
52. | HTS /mali/ 59.8 61.54 2.91 
53. | Az /kora/ 64.5 68.3 5.89 
54. | ats /prita/ 63.3 66.96 5.78 
55. | StS /taba/ 58.74 61.42 4.56 
56. | ater /kata/ 58.52 63.27 8.12 
ST. | ret Afadi/ 60.21 62.17 3.26 
58. | YSt /pati/ 59 63 6.78 
59. | dail /miigi/ 59.63 62.48 4.78 
60. | Sta /thakor/ 61.2 64.87 6.00 
61. | wd /khator/ 60.5 63.65 5.21 
62. | wireatz /akor/ 60.63 64.75 6.80 
63. | BOSE /tfanon/ 64.77 66.75 3.06 
64. | ums /pakSd/ 64.41 63.71 -1.09 
65. | STBat /balok/ 59.39 63.26 6.52 
66. | ATS /sara/ 59.72 64.03 Tae 
67. | Hit /midgi/ 62.85 65.3 3.90 
68. | get /rani/ 61.37 66.79 8.83 
69. | ATS" /kapla 61.68 66.07 dike 
70. | het / dgina/ 58.23 64.87 11.40 
TL. | fat /khira/ 63.02 65.37 3.73 
72. | ast /gani/ 63.38 67.27 6.14 
73. | ALET /sina/ 61.07 65.69 oe 
74. | Gat /koSdi/ 64.3 66.41 3.28 
73. | HOS /mojur/ 62.01 63.15 1.84 
76. | Beret /tfolak/ 61.83 60.23 -2.59 
77. | Was /jakin/ 59.35 63.38 6.79 
78. | fSATS /hisab/ 62.19 62.43 0.39 
79. | Sard /bozar/ 58.11 59.56 2.50 
80. | Hota /morids/ 63.13 62.56 -0.90 
81. | Sate /korib/ 61.89 61.56 -0.53 
82. | GAS /udgen/ 59.62 60.16 0.91 
83. | WaT /akal/ 59.4 62.01 499 
84. wfetH /aphim/ 60.52 64.71 6.92 
85. | @3TS /vutar/ 60.75 63.11 3.88 
86. | Gare /uttan/ 61.58 63.29 2.78 
87. | WATS /osan/ 60.6 63.27 4.41 
88. | wpttd /omir/ 61.39 65.56 6.79 
89. | FBS /dgoban/ 59.84 63.97 6.90 
90. | wig /Sgur/ 62.78 65.36 4.11 
91. | Hete/modgud/ 59.8 59.36 -0.74 
92. | Set3/vedat/ 59.25 60.38 1.91 
93. | ShHro/iman/ 66.57 67.39 123 
94. | JesvVbenti/ So AE 61.75 8.12 
95. | HFed/mordar/ 60.02 62.43 4.02 
Annexure II


Di-syllabic Supra-Laryngeal tonal words 


S. Word Syllabic weight Intra-syllabic 
No. stress (s; in %) 
sl s2 
1 BS (/tfut"/) 68.47 58.63 -14.36 
2 WI (/kar/) 66.64 57.57 -13.60 
3 fiat (/kiggi/) 68.94 60.92 -11.63 
4 f¥zq (/tfirak/) 70.20 63.04 -10.20 
5 oat (/t51i /) 72.25 65.41 -9.46 
6 Det (/tobi/) 69.67 63.29 -9.15 
7 wat (/tadi/) 68.07 62.94 -7.53 
8 Ccict (/pebit/) 66.94 62.23 -7.04 
9 ZTE" (/tshma/) 68.39 63.61 -6.99 
10 we (/taba/) 67.74 63.00 -6.99 
11 Wer (/kdm /) 70.08 65.39 -6.70 
12 wrest (/kahi/) 65.78 61.38 -6.69 
13 Sd (/tfage/) 67.76 63.77 -5.88 
14 ES (/tilla/) 70.43 66.31 -5.85 
15 wat (/kad /) 67.06 63.22 -5.73 
16 dH (/tdcom/) 69.67 66.09 -5.13 
17 Wet /kota/ 67.74 64.33 -5.05 
18 SAH (/pasom/) 70.01 66.47 -5.05 
19 ulg /kéca/ 66.07 62.76 -5.01 
20 vat (/tfoli/) 68.71 65.40 -4.31 
21 UST /kAli/ 68.06 65.34 -3.99 
22 BT (/tfuta/) 66.34 63.90 -3.68 
23 UAE (/tdkkon/) 68.76 66.27 -3.63 
24 BiH (/tfad or/) 66.01 63.74 -3.43 
25 wast (/kid /) 67.65 65.60 -3.04 



26 BZ (/tfadu/) 67.23 65.28 -2.90 
27 Sait (/pdgi/) 69.44 67.65 -2.57 
28 OOH (/tano{/) 70.15 68.41 -2.49 
29 UST (/koda/) 65.35 64.27 -1.65 
30 WIS (/kticana/) 66.31 65.22 -1.64 
31 BS" /tfSda/ 66.03 65.07 -1.46 
32 Be (/psdw/) 68.96 68.05 -1.31 
33 Ware (/tonad/) 65.96 65.53 -0.64 
34 fas (/tfigat/) 66.58 66.84 0.38 
35 aeet (/kadai/) 66.85 67.18 0.50 
36 ate (/kSdui/) 68.95 69.71 1.11 
37 ASEH (/sud3ai/) 65.73 66.49 1.15 
38 AB™ (/sudzao/) 66.23 66.99 15 
39 ass 64.34 65.65 2.04 
40 fSwurA (/nigas/) 65.49 67.56 3.16 
41 Yas (/pradan/) 68.56 70.98 3,52 
42 Att (/sddi/) 65.11 67.52 3.70 
43 vet (/t/tidi/) 70.19 73.04 4.06 
a4 TEE (/hddau/) 63.15 65.91 4.37 
45 foHfSH (/ermd3im/) 65.43 68.81 5.16 
46 ast (/nabi/) 62.15 65.43 5.29 
47 est (/toba/) 65.12 68.63 5.39 
48 WeRUs (onkdp) 62.84 66.34 5.57 
a9 WIFE (/od3d3dk/) 62.87 66.84 6.30 
50 nifA (abias) 62.53 66.70 6.67 
51 FTES (/sidl/) 65.64 70.09 6.78 
52 HU (/modir/) 63.35 68.53 8.18 
53 fSas (/n1cbé/) 63.07 68.23 8.19 
54 uo (/ktada/) 59.90 64.85 8.26 

55 W Ss /ducldtb / 63.12 69.30 9.79 

56 Tas (/dorba/) 63.47 70.15 10.53 
57 usd 60.44 67.51 11.70 
58 TT (/goda/) 60.85 68.08 11.88 
59 nO (/adda/) 57.53 64.54 12.19 
60 To (/gudda/) 59.95 67.65 12.83 
61 IME (/guad 1/) 62.04 70.10 12.99 
62 Mes /bid'1/ 62.10 70.85 14.09 
63 eur (/dudia/) 59.53 68.02 14.26 
64 Bs (/budda/) 58.70 67.08 14.29 
65 IO (/gidda/) 58.03 66.41 14.45 
66 fEU¢ (/1di dr/) 59.53 71.70 20.45 
Annexure III


Di-syllabic laryngeal words


S. No. Word Stress (s1) Stress (s2) %age increase of
stress (si)

1. AT fea /sohark/ 64.62 62.63 -3.07 
2. | Hee /sohai/ 64.89 62.92 -3.04 
3. | URE /porai/ 68.4 67.23 -1.7
4. | Afar /sehad3/ 62.76 62.8 0.07 
S. | $e" /hevan/ 61.58 61.89 0.5 

6. | Arfgst /saheb/ 62.52 62.95 0.69 
7. | Ser /hotfra/ 63.57 64.26 1.07 
8. | Hag /fohor/ 64.53 65.23 1.09 
9. | TAS /hadgor/ 60.64 61.58 1.55
10. JT /hura/ 63 64.19 1.89 
11. | wife /ehod/ 62.34 63.57 1.98 
12. | wrod /ahor/ 63 64.65 2.61 
13. | wfod /ohar/ 63.55 65.85 3.62 
14. | Sard /toba/ 61.73 64.39 4.31 
15. | 3H /tora/ 64.16 67.15 4.66 
16. | Sted hhitor/ et 65.15 6.63 
17. | ufgor fp tial | Gag 69.16 7.39 
18. | OB /thola/ 63.64 69.31 8.91 
19. | AS \vasa/ 58.9 64.84 10.8 
20. | fedat /éna/ 60.81 68.02 11.87 
21. | ner /Ala/ 54.61 61.31 12.28 
22. | Iz /galsy/ 60.69 68.6 13.04 
23. | THF /gomédr/ 60.89 69.27 13.75 
24. | FASE /dgildn/ 58.8 67.5 14.8 
Annexure IV


Tri-syllabic non-tonal words


S. No. Word Stress (s1) Stress (s2) Stress (s3) (s3-s2)/s2*100
(%age increase of stress (si))

l. | Sfrea 60.60 62.31 61.39 -1.48 
/bestdok/ 

2. | (HAHAH 60.90 61.80 61.16 -1.04 
/kafmokaf/ 

3. | Sea 56.74 58.13 57.54 -1.01 
/tdbaku/ 

4. | dane 59.21 62.27 62.43 0.26 
/korapon/ 

5. | s3qqr 58.59 62.73 63.56 1.33 
/tatkora/ 

6. | gretau 54.82 59.76 60.75 1.65 
/davadol/ 

7. | Fass 55.87 58.32 59.31 1.69 
/dzalalot/ 

8. | gedaa 57.28 59.10 60.33 2.08 
/gerhadzor/ 

9. | Serarg 58.96 62.03 63.78 2.82 
/tikakar/ 

10.) fasuHot 62.27 62.64 64.51 2.98 
/tfilmotfi/ 

11.) Seas 55.97 57.34 59.05 2.99 
/befakal/ 

12.| Agads 56.12 60.06 62.00 3.24 
/dzobonvat/ 

13.| @geer 59.18 62.71 65.00 3.66 
/oltona/ 

14. | feersr 60.17 62.03 64.48 3.95 
/tikona/ 

15.| zuES 58.19 59.70 62.08 3.98 
/tapobon/ 

16.) neargua 54.69 59.99 62.84 4.75 
/akormok/ 

17.| Fg sHe 60.92 60.98 63.88 4.75 
/surotmdd/ 

18.) fosdtar 59.51 56.84 59.71 5.05 
/tfilgoza/ 

19.) ateufeq 58.67 58.94 62.02 5.24 
/kalponik/ 

20. | Denys 62.17 59.44 62.62 5.36 
/polapon/ 

21.) S63 3= 58.30 57.70 60.91 5.56 
/niletton/ 

22.| urgedgHa | 58.77 56.08 59.42 5.96 
/pardorfok/ 

23.| qaed 56.62 57.71 61.78 7.05 
/dzalodor/ 

24. | urfisr 56.52 59.18 63.63 7.53 
/amita/ 

25.| frasg 56.09 59.67 64.49 8.08 
/tkattor/ 

26.| guisd 56.99 58.06 63.02 8.54 
/rupator/ 

27.| wHEeS 56.84 59.24 64.45 8.78 
/asap*al/ 

28.| @Hds 56.48 56.59 61.56 8.79 
/odgzorat/ 

29.) rears" 54.65 55.44 61.14 10.26 
/dzagona/ 

30.) uAfest 65.13 61.95 68.68 10.85 
/kbo fdili/ 
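
The last column of the table above follows the formula given in its header, (s3-s2)/s2*100. The following is a minimal sketch of that calculation in Python; the function name `stress_increase_percent` is hypothetical and not part of the original study. Because the tabulated weights are already rounded to two decimals, a recomputed value can differ from the printed one in the last digit.

```python
def stress_increase_percent(previous_weight, last_weight):
    """Percentage change in syllabic weight from one syllable to the next.

    Implements (s3 - s2) / s2 * 100 as stated in the table header.
    """
    return (last_weight - previous_weight) / previous_weight * 100


# (s2, s3) pairs copied from the first three rows of the table above
rows = [(62.31, 61.39), (61.80, 61.16), (58.13, 57.54)]
for s2, s3 in rows:
    # Negative values mean the final syllable is lighter than the one before it
    print(round(stress_increase_percent(s2, s3), 2))
```

The same function can be applied to the di-syllabic annexures by passing (s1, s2) instead of (s2, s3).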
Annexure V


Tri-syllabic Supra-Laryngeal tonal words


S. No. Word IPA Stress (s1) | Stress (s2) | Stress (s3) | (s3-s2)/s2*100
(%age increase of stress (si))
1. | Fest /s6dola/ 65.54 67.59 64.52 -4.54 
2. suat i /p suri/ 69.71 68.98 66.35 -3.8 
3. | feegr /{idora/ 68.1 67.22 65.42 -2.68 
4. | Hoge /sodacon/ 66.17 67.18 65.57 -2.41 
5. | nde /3c8 ca/ 64.25 67.93 66.71 -1.79 
6. | faster /nibaona/ | 65.54 68.37 67.45 -1.34 
7. | fewer /nigarna/ 63.08 65.41 64.96 -0.68 
8. | gen /budapa/ 61.59 65.56 65.2 -0.55 
9. | SeEHU /hdddnsac/ | 61.14 67.09 67.44 0.52 
10. ) fors"@er /gidzauna/ | 60.67 66.54 67.43 1.34 
11. | srysr /tagore / 64.22 67.48 69.02 2.28 
12. | cpyger /pdgarna/ | 62.76 66.53 68.72 3.28 
13. | Guper /agana/ 62.83 66.95 70.07 4.65 
14. — sod’ na/ 64.12 68.19 71.43 4.75 
15. | gaer” /tfokor / 69.74 62.75 66.1 5.33 
16. fAcTT /priftatfac/ | 68.13 63.8 67.29 5.46 
17. | AsteTS /sadzidac/ | 59.21 62.06 65.89 6.18 
18. | feugar /nigorna/ 62.63 64.37 68.44 6.33 
19. | gg /g3@ la/ 61.93 67.47 72.15 6.93 
20. | Gear /to@ la/ 67.86 63.23 68.14 7.76 
21. | gars" /tfagra/ 68.36 61.11 66.32 8.53 
22. | Sus /tfSpp di/ 68.09 60.65 66.15 9.06 
23. | @xey /odzabog/ | 62.6 59.01 64.52 9.34 
24. | Sas" /pidgdzona | 70.61 61.5 69.64 13.23 
25. | Sgt aaa 61.65 63.27 72.02 13.83 
26. /bud3dzan_ | 58.7 62.72 71.42 13.87 
a/ 
213 kre £6/ 65.08 62.83 71.92 14.47 
28. /t1d3dzana/ | 60.21 61.61 71.77 16.49 
Annexure VI


Tri-syllabic laryngeal words


S. Syllabic weight Intra-syllabic 
No stress (si in 
Word s1 s2 s3 %) 
1. | AEPEF /solaba/ 66.43 67.81 62.37 -8.03 
2. | METS /kralarna/ 62.01 64.04 | 62.75 -2.02 
3. | fefSaH /itehas/ 61.53 | 64.86 | 63.92 -1.46 
4. | fenfsod /1ftehar/ 59.64 | 64.78 | 64.07 -1.09 
5. | waders /ohSkar/ 59.74 62.14 62.61 0.76 
6. | fexfsae /imtehan/ 64.65 | 66.21 66.91 1.05 
7. | BRITS (fehotut/ 67.67 67.58 68.29 1.05 
8. | HIS /sohela/ 63.81 63.79 | 65.05 1.98 
9. | fHISHE /schotomSd/ 66.37 | 66.38 | 68.14 2.65 
10.) FfsHE /sésuba/ 59.25 60.36 62.09 2.87 
11.) Hearse /soharna/ 64.63 63.03 64.92 3.00 
12.) geter /hukana/ 59.39 | 62.58 _ | 64.48 3.03 
13.) Jers /holara/ 59.82 60.86 62.90 3.35 
14.) Freeh /fahodi/ 63.43 63.13 65.25 3.37 
15.) fe /sehora/ 66.58 66.54 | 69.36 4.25 
16.) Hates /sohaita/ 64.25 | 63.10 | 65.81 4.30 
17.) aE /sohona/ 65.38 67.45 70.43 4.42 
18.) Has /sohara/ 63.82 62.40 65.25 4.57 
19.) SAM /hesiot/ 58.83 | 61.99 | 65.01 4.86 
20.) frige" /sindna/ 64.17 | 66.98 _ | 70.42 5.14 
21.) BBEr /kolsna/ 63.45 67.79 71.43 5.36 
22.| wifeiHet /ohisok/ 61.44 | 61.42 | 65.00 5.83 
23.) YaST /krullova/ 66.01 | 68.26 _| 72.90 6.80 
24.) fOHIS /himatfl/ 60.56 59.63 63.82 7.02 
25.) foartest /hikatti/ 58.76 | 59.48 | 63.95 7.51 
26.) epfet /kromani/ 59.33 64.62 710.46 9.03 
27.) AS /khorsva/ 63.98 | 63.74 | 69.99 9.81 
28.) feedd /vidord/ 58.60 | 57.60 __| 67.74 17.61 
Annexure VII


Poly-syllabic words


S. No. Word Syllabic weight 
sl s2 s3 s4 s5 
1. wifefoorAns /okirrafil/ 58.52 59.53 60.67 57.58 | - 
2. JUZIS /rupatoron/ 55.43 57.99 60.85 60.31 | - 
3 — /k*atfevala/ 56.95 61.63 62.08 61.53 | - 
4. UMASS /pdkrivala/ 57.94 59.80 62.28 62.56 | 62.25 
5: DAS SST /gudzravala/ 54.22 51.92 60.39 62.07 | 62.20 
6. Sareet /nakabddi/ 57.07 61.28 60.08 60.95 | - 
es fFASAUE /ikvasapon/ 53.71 57.01 60.82 61.85 | - 
8. Achite|d /séimitor/ 58.04 62.21 61.10 62.21 | - 
9; Barsuet /ogorpati/ 51.81 58.96 60.73 62.69 | - 
10. | @AHSUET /udzoddpuna/ 57.40 58.04 57.13 60.30 | 63.39 
Il. | S3reet /tfetavoni/ 58.36 59.19 59.35 62.75 | - 
12. | poTH|eet 58.09 61.49 56.60 54.06 | 60.73 
/samradgzvadi/ 
Chapter 6 - Correlation of Morpho-Syntactic features with lexical
representation and its co-articulation 


Annexure I 

POS for Punjabi 

S. No. | Top-level category | Sub-type (level 1) | Sub-type (level 2) | Label | Annotation Convention
1 | Noun | | | N | N
1.1 | | Common | | NN | N_NN
1.2 | | Proper | | NNP | N_NNP
1.4 | | Nloc | | NST | N_NST
2 | Pronoun | | | PR | PR
2.1 | | Personal | | PRP | PR_PRP
2.2 | | Reflexive | | PRF | PR_PRF
2.3 | | Relative | | PRL | PR_PRL
2.4 | | Reciprocal | | PRC | PR_PRC
2.5 | | Wh-word | | PRQ | PR_PRQ
2.6 | | Indefinite | | PRI | PR_PRI
3 | Verb | | | V | V
3.1 | | Main | | VM | V_VM
3.1.2 | | | Non-finite | VNF | V_VM_VNF
3.1.3 | | | Infinitive | VINF | V_VM_VINF
3.1.4 | | | Gerund | VNG | V_VM_VNG
3.2 | | Auxiliary | | VAUX | V_VAUX
4 | Adjective | | | JJ |
5 | Adverb | | | RB |
6 | Demonstrative | | | DM | DM
6.1 | | Deictic | | DMD | DM_DMD
6.2 | | Relative | | DMR | DM_DMR
6.3 | | Wh-word | | DMQ | DM_DMQ
6.4 | | Indefinite | | DMI | DM_DMI
7 | Postposition | | | PSP |
8 | Conjunction | | | CC | CC
8.1 | | Co-ordinator | | CCD | CC_CCD
8.2 | | Subordinator | | CCS | CC_CCS
9 | Particles | | | RP | RP
9.1 | | Default | | RPD | RP_RPD
9.2 | | Classifier | | CL | RP_CL
9.3 | | Interjection | | INJ | RP_INJ
9.4 | | Intensifier | | INTF | RP_INTF
9.5 | | Negation | | NEG | RP_NEG
10 | Quantifiers | | | QT | QT
10.1 | | General | | QTF | QT_QTF
10.2 | | Cardinals | | QTC | QT_QTC
10.3 | | Ordinals | | QTO | QT_QTO
11 | Residuals | | | RD | RD
11.1 | | Foreign word | | RDF | RD_RDF
11.2 | | Symbol | | SYM | RD_SYM
11.3 | | Punctuation | | PUNC | RD_PUNC
11.4 | | Unknown | | UNK | RD_UNK
11.5 | | Echowords | | ECH | RD_ECH
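
The annotation convention in the table above appears to be built by joining the labels on the path from the top-level category down to the sub-type with an underscore (for example V, VM and VNF give V_VM_VNF). The following is a small sketch of that reading in Python; the helper names `annotation_label` and `top_level_category` are hypothetical, and the single-underscore separator is an assumption based on the cleanly printed entries of the table.

```python
def annotation_label(*levels):
    """Join hierarchy labels (top-level first) into one annotation string."""
    # Empty levels are skipped so that a top-level tag maps to itself
    return "_".join(level for level in levels if level)


def top_level_category(annotation):
    """Recover the top-level label from an annotation string."""
    return annotation.split("_")[0]


print(annotation_label("N", "NN"))          # common noun
print(annotation_label("V", "VM", "VNF"))   # non-finite main verb
print(top_level_category("PR_PRP"))         # pronoun category label
```

Keeping the hierarchy recoverable from the tag string lets a tool fall back to the coarse category when a fine-grained sub-type is not needed.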
Chapter 8
Research Findings and Future Work 


8 Research Goal 


The main objectives of the proposed research have been:
i. Adaptation of the W3C PLS 1.0 to evolve a framework capturing the phonological
features of the Punjabi language.
ii. Corroboration of the major linguistic aspects through analytical study of recorded
speech signals for the Punjabi language.
iii. Identification of the challenges in designing a web-based Machine-Readable
Pronunciation Lexicon Specification in XML.
iv. Design of new lexeme elements to incorporate the identified features.


8.1 The Research Undertaken 


A given phoneme is not always pronounced the same way in every context. Therefore
the concepts of articulatory phonetics need to be explored to model pronunciation.
A Machine-Readable Pronunciation Lexicon for Punjabi can be built by leveraging the
existing W3C Pronunciation Lexicon Specification recommendations, which are global
in nature and need to be internationalized from this perspective. It is a step-by-step
interdisciplinary process which involves the study of language-specific phonological
features using experimental phonetics. Specific emphasis was laid on the study of the
suprasegmental features of Punjabi to evolve a rule set leveraging the existing
knowledge found in the linguistic literature. A layered approach was adopted to verify
the existing rules and discover exceptions. A new knowledge base has been created to
report and handle these exceptions by evolving additional rules to augment the
machine learning approaches in speech processing. Thus the PLS 2.0 framework has
been developed, which can model Punjabi lexicon pronunciation in the global IPA
standard.
The framework is expressed in the Extensible Markup Language (XML), which can be
used for both machine and human consumption; however, the machine-readable
pronunciation lexicon is the major outcome, which will aid the production of Punjabi
speech systems. The phonology specific to the Punjabi language, when systematically
approached through experimental effort using computer-aided tools, can help discover
the way sounds are realized differently in different environments as governed by the
grammar of the language. The phonological rules thus discovered can be used for
building computational models of phonological learning, i.e. how the phonological
rules can be automatically induced by machine learning algorithms.
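
As an illustration of what a machine-readable entry looks like, the sketch below builds a minimal W3C PLS 1.0 lexicon with Python's standard library. It uses only the core PLS 1.0 elements (`lexicon`, `lexeme`, `grapheme`, `phoneme`) and does not attempt to reproduce the new PLS 2.0 lexeme elements designed in this work. The function name `build_lexicon`, the language tag `pa`, and the Gurmukhi spelling used for the sample word /hisab/ are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"  # PLS 1.0 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"             # for xml:lang


def build_lexicon(entries, lang="pa"):
    """Serialize (grapheme, IPA) pairs as a PLS 1.0 lexicon string."""
    ET.register_namespace("", PLS_NS)  # emit PLS as the default namespace
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for grapheme, ipa in entries:
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(lexicon, encoding="unicode")


# Sample entry: the IPA string is taken from Annexure I of Appendix D;
# the Gurmukhi spelling is an assumed reading, not copied from the source.
print(build_lexicon([("ਹਿਸਾਬ", "hisab")]))
```

Suprasegmental information such as tone or stress would need additional markup beyond this baseline, which is the gap the proposed framework addresses.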


8.2 Evidence from Quantitative Analysis for Phonological Rules 


Quantitative research involves the use of computational, statistical, and mathematical
tools to derive results. It is conclusive in its purpose, as it tries to quantify the problem
and understand how prevalent it is by looking for results that can be projected to a
larger population. Thus the segmental and suprasegmental prosodic features require
in-depth analysis for arriving at phonological rules.


Segments, usually phonological units of the language, such as vowels and consonants, 
are of very short duration. A given feature may be limited to a particular segment but 
may also be longer (as a suprasegmental feature). Suprasegmental refers to a 
phonological property of more than one sound segment. Suprasegmental information 
applies to several different linguistic phenomena (such as pitch, duration, intensity and 
loudness). The data gathered was annotated at phoneme level for study of segmental 
features and at syllable level for examining the suprasegmental features. The 
parameters were recorded as discussed in the previous chapters. The proposed 
hypothesis was validated and variations reported. 

Tone is a very important feature of the Punjabi language which makes it distinct from
other Indo-Aryan languages. Hence an elaborate study of tone has been carried out, as
discussed in chapters 2 & 3. Stress has not been considered a very crucial parameter by
Punjabi linguists. However, it has been given due attention here as it becomes relevant
from the machine perspective. Stress has been dealt with at the intra-syllabic level
within a word, which is meaningful for building a lexicon that can be utilized for
artificial production of speech via text-to-speech tools. These tools can utilize this
stress information to produce a near-human voice through machine learning of the
prosodic features incorporated in the data developed on the basis of PLS 2.0. Similarly,
these features can be leveraged by speech recognition systems to attain an acceptable
level of efficacy in recognizing native speakers' speech.

The steps involved in quantitative research can be divided into:

1. Formulation of the current hypothesis based on the literature survey
2. Collection of appropriate data to verify the hypothesis
3. Analysis of the data to validate the hypothesis and report rules along with
exceptions
4. Evolution of a new hypothesis


8.3. Research Findings 


8.3.1 Tones 


The observational experimental methodology deliberated in chapter 3 was adopted
to report the types of tones observed in the Punjabi lexicon, based on the slope pattern
of the fundamental frequency of the tone bearing unit (TBU), i.e. the vowel.
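
A rough sketch of how a contour label can be read off quarter-wise measurements is given below. It is not the procedure used in the study, which is described in chapter 3; it simply assumes that the four quarter-wise values reported per speaker in the sample data sheets of Appendix B can be treated as successive F0 readings, and labels the contour from the sequence of rises and falls. The function name `classify_contour` and the `tolerance` parameter are hypothetical.

```python
def classify_contour(quarter_f0, tolerance=0.0):
    """Label a tone contour (HL, LH, LHL, HLH, ...) from successive readings."""
    directions = []
    for earlier, later in zip(quarter_f0, quarter_f0[1:]):
        change = later - earlier
        if abs(change) <= tolerance:
            continue  # treat as level: it neither starts nor ends a movement
        direction = "rise" if change > 0 else "fall"
        # Consecutive movements in the same direction count as one movement
        if not directions or directions[-1] != direction:
            directions.append(direction)
    if not directions:
        return "level"
    # A rise goes from L to H and a fall from H to L; chain them into a label
    label = "L" if directions[0] == "rise" else "H"
    for direction in directions:
        label += "H" if direction == "rise" else "L"
    return label


# Quarter-wise values copied from the sample data sheets (Tables 25, 28 and 27)
print(classify_contour([148, 147, 145, 143]))  # speaker M1, Table 25
print(classify_contour([282, 297, 309, 327]))  # speaker F1, Table 28
print(classify_contour([202, 204, 206, 202]))  # speaker M1, Table 27
```

On these three rows the labels agree with the contours recorded in the tables (HL, LH and LHL respectively). A non-zero `tolerance` would suppress very small movements, which matters when deciding whether a final dip of a few units should count as an allotone.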


8.3.1.1 Verification and Validation 


The available hypothesis on high and low tones in Punjabi has by and large been
corroborated for both types of tones, viz. tones arising from supra-laryngeal consonants
and independent tones, which have been experimentally verified as discussed in
chapter 3.
8.3.1.2 Discovery of Allotones


The level tone in Punjabi is not marked. The existence of the low tone (HL) and the
high tone (LH) is well recognized, but they are not represented orthographically.
However, native Punjabi speakers handle the tone variations in their speech naturally
and predictably.

Allotones are linguistically non-significant variants of tones, but they are considered
important for the development of technologies such as speaker identification,
language identification and speech recognition, as they may vary from person to
person and occasion to occasion. Two new allotones have been discovered, viz. LHL
as an allotone of LH and HLH as an allotone of HL. This phenomenon has been
noted in 50-70% of the speakers and very rarely in all speakers in a particular context,
as elaborated in the table below.


Tone on vowel of syllable under consideration | Category of words (syllable under consideration) | Co-articulation parameters in a syllable | Tone/Allotones (percentage of speakers)
LH | Monosyllabic | Consonant /h/ as coda | LH (50%); LHL (50%)
LH | Mono/di/tri/poly-syllabic (initial syllable) | Toneme or conjunct containing /ɦ/ as coda | LH (100%)
LH | Di/tri/poly-syllabic (medial syllable with short vowel as nucleus) | Toneme or conjunct containing /ɦ/ as onset | LH (100%)
LH | Di/tri/poly-syllabic (final open syllable) | Toneme as onset | LH (100%); tone shifts to nucleus (vowel) of prior syllable
LH | Tri-syllabic (final open syllable) | Diphthong (long + long) | LHL (100%)
LH | Di/tri-syllabic (final closed syllable) | Consonant /h/ or conjunct containing /ɦ/ as coda | LH (70%); LHL (30%)
HL | Monosyllabic (closed syllable) | Toneme as onset; consonant /h/ as coda | HL (100%)
HL | Monosyllabic (closed syllable) | Toneme as onset; any other consonant as coda | HL (50%); HLH (50%)
HL | Monosyllabic (open syllable) | Diphthong | HLH (100%)
HL | Di/tri/poly-syllabic (initial syllable) | Toneme as onset | HL (100%)
HL | Tri/poly-syllabic (medial open syllable and long vowel as nucleus) | Toneme or conjunct containing /ɦ/ in the onset; diphthong (long + short vowel) | HL (100%)
HL | Di-syllabic (final closed syllable) | Toneme as onset | HL (100%)
HL | Di-syllabic (final closed syllable) | Toneme as coda | HL (60%); HLH (40%)
HL | Di-syllabic (final closed syllable) | Diphthong (short + long) and (long + long) and flap / fricative / nasal coda | HLH (100%)
HL | Tri-syllabic (final open syllable) | Consonant /h/ or conjunct of /ɦ/ as coda with diphthong (long + long) | HLH (100%)

Table 8/1: Research Findings on Tones/Allotones
8.3.1.3 Extrapolation of the Existing Knowledge Base

• The detailed experimental analysis of the co-articulation parameters, examined
through recording, annotation and quarter-wise slope observations of the
fundamental frequency on tone-rich data covering a large variety of phonetic
contexts, led to an in-depth understanding of the tone patterns of the Punjabi language.

• The visualization of the tone patterns using scientific tools has corroborated the
perceptual and X-ray studies done by linguists and added conviction.

• The speaker-dependent reflection of allotones in certain acoustic contexts has
also been discovered.

• Tone in Punjabi is exhibited on the vowel associated with the toneme (viz. the
nucleus of the syllable containing the toneme); however, it has been discovered that it
shifts to the nucleus of the prior syllable in di/tri/poly-syllabic words having a
toneme as onset of a final open syllable.

• The tone patterns of mono-syllabic words do not find much discussion in the
literature. A significant amount of data was analyzed to report the findings.

• Allotone HLH has been discovered in all the speakers in the case of open
diphthongal monosyllabic words, final closed diphthongal di-syllabic words having a
flap/fricative/nasal coda, and diphthongal tri-syllabic words with a toneme,
consonant /h/ or conjunct of /ɦ/ as coda.


8.3.2 Lexical Stress 


Stress in Punjabi is distributed solely according to a pattern based on the syllables 
contained within a word. The linear regression technique was used to investigate the 
relationship between the outcome variable and multiple explanatory variables that are 
potentially correlated with each other. The statistical analysis of each parameter was 
carried out. The intra-syllabic stress was calculated to report the stress patterns in 
Punjabi lexicon across various word categories. The literature survey discusses the 
possibility of its occurrence on ultimate/penultimate syllable in a word however stress 


related information is not found in Punjabi dictionaries. 
The following research contributions were made: 


e Empirical formula for stress function was derived through Linear Regression 
analysis by modeling the relationship between the dependent variables viz 
pitch, duration & intensity to determine the extent of contribution each 
variable makes towards intra-syllabic stress which is significant research 


contribution as no quantitative research has been reported so far. 
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The empirical formula was used to calculate the syllable weight for each syllable 
in a word covering all the words. The heaviest syllable in each word was 


identified. 


e The statistical approach such as Normal Distribution was adopted to analyze 
this data and stress rules were evolved for each category of words by 
calculating Mean, Standard Deviation and plotting the normal distribution 
curve to report the stress marking rules in the Punjabi PLS data. 

• These stress rules are largely applicable, depending on the position and context
of syllables in a word:

i. Stress on the ultimate syllable (applicable in the majority of cases)
ii. No stress (when a toneme is present in the initial syllable)
iii. Stress on the penultimate syllable (found in 50% of the polysyllabic words)
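
The regression-based procedure described in the contributions above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the empirical
formula of the thesis: the function names are hypothetical, the features are assumed to
be pre-normalised, and an ordinary least-squares fit written in plain Python stands in
for whatever statistical package was actually used.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple

# (pitch, duration, intensity) of one syllable; assumed to be pre-normalised.
Features = Tuple[float, float, float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_stress_function(features: Sequence[Features],
                        stress: Sequence[float]) -> List[float]:
    """Least-squares fit of stress = b0 + b1*pitch + b2*duration + b3*intensity."""
    rows = [[1.0, *f] for f in features]  # the leading 1.0 models the intercept b0
    k = len(rows[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, stress)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def syllable_weight(coeffs: Sequence[float], f: Features) -> float:
    """Apply the fitted formula to one syllable."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], f))


def heaviest_syllable(coeffs: Sequence[float], word: Sequence[Features]) -> int:
    """Return the 0-based index of the heaviest syllable (first one on ties)."""
    weights = [syllable_weight(coeffs, f) for f in word]
    return max(range(len(weights)), key=weights.__getitem__)


def weight_distribution(weights: Sequence[float]) -> NormalDist:
    """Summarise a word category by the mean and standard deviation of its weights."""
    return NormalDist.from_samples(weights)
```

The coefficients returned by `fit_stress_function` show how much each acoustic variable
contributes to the fitted stress value, and `heaviest_syllable` then selects the
candidate syllable for stress marking in each word.
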

Rule No. | Rule for marking intra-syllabic stress | Condition
R1 | Ultimate syllable | Di/Tri/Poly syllabic non-tonal words, and tonal words except words having toneme in initial syllable
R2 | No stress | Di/Tri/Poly syllabic tonal words having toneme in initial syllable
R3 | Penultimate syllable | Some polysyllabic words


Table 8/2: Rules for Marking Intra-syllabic Stress
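
The rules of Table 8/2 could be applied mechanically along the following lines. This is
a sketch under assumptions that the table does not settle: the function name is
hypothetical, R2 is given precedence over R3, "polysyllabic" is taken to mean four or
more syllables, and membership in the R3 group is supplied by the caller because the
table only says "some" polysyllabic words.

```python
from typing import Optional, Sequence


def stressed_syllable(has_toneme: Sequence[bool],
                      penultimate_exception: bool = False) -> Optional[int]:
    """Return the 0-based index of the stressed syllable, or None for no stress.

    has_toneme[i] is True when syllable i carries a toneme.
    """
    n = len(has_toneme)
    if n < 2:
        return None  # Table 8/2 only covers di-, tri- and polysyllabic words
    if has_toneme[0]:
        return None  # R2: toneme in the initial syllable, no stress is marked
    if penultimate_exception and n >= 4:
        return n - 2  # R3: some polysyllabic words (assumed: four or more syllables)
    return n - 1  # R1: ultimate syllable
```

For example, a tri-syllabic word whose toneme falls on the second syllable is handled by
R1, whereas the same word with the toneme on the first syllable is handled by R2.
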


8.3.3 Acoustic Variability of Schwa 


As discussed by linguists, schwa in Punjabi is a mid-central vowel, as indicated in the
vowel triangle shown below. No further study has been reported on variations in its
acoustic properties.

[Figure: Punjabi vowels; surviving labels: Centralised, Front, Unrounded]


Fig 8/1: Vowel Triangle


The schwa has been the subject of much research by phonologists globally. In Punjabi,
the schwa is not written orthographically as part of a consonant cluster; however, it
is realized phonetically, as observed through data annotation. The current Punjabi
dictionaries also mark it in the pronunciation. The analysis was carried out over
different classes of phonetic context, which has led to the discovery of certain
acoustic variations.


Allophone of schwa (ə) | Phonetic context | Vowel height | Vowel frontness
i | Nasalized schwa | Close-mid | Central
ii | Schwa in tri-syllabic words having geminated toneme as onset | Close-mid | Transition zone front
iii | Schwa as release vowel in isolated words | Approaching near-close | Transition zone front


Table 8/3: Acoustic Variations of Schwa 

Three allophones of schwa have been discovered:

i. The nasalized schwa is close-mid and central
ii. The schwa in tri-syllabic words having a geminated toneme as onset is close-mid
and lies on the rear border of the front transition zone
iii. The schwa as release vowel in isolated words approaches near-close and lies on
the front border of the front transition zone
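
For use in a machine-readable lexicon, the three allophones could be held in a small
lookup table. The sketch below only transcribes the height and frontness values of
Table 8/3; the dictionary keys, the class name and the function name are hypothetical
labels chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SchwaAllophone:
    context: str    # phonetic context in which the allophone was observed
    height: str     # vowel height as reported in Table 8/3
    frontness: str  # vowel frontness as reported in Table 8/3


# Keys are hypothetical labels; the values transcribe Table 8/3.
SCHWA_ALLOPHONES: Dict[str, SchwaAllophone] = {
    "nasalized": SchwaAllophone(
        "nasalized schwa", "close-mid", "central"),
    "geminated_toneme_onset": SchwaAllophone(
        "schwa in tri-syllabic words having geminated toneme as onset",
        "close-mid", "transition zone front"),
    "release_vowel": SchwaAllophone(
        "schwa as release vowel in isolated words",
        "approaching near-close", "transition zone front"),
}


def describe_schwa(label: str) -> str:
    """Return a one-line description of the allophone stored under `label`."""
    a = SCHWA_ALLOPHONES[label]
    return f"{a.context}: {a.height}, {a.frontness}"
```
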


The augmented vowel triangle incorporating acoustic variations of schwa is as below: 


[Figure: Punjabi vowels; surviving labels: Front, Back, Unrounded, Rounded]


Fig 8/2: Acoustic Variations of Schwa (augmented vowel triangle)


Analysis of the same set of words within sentences revealed that the release vowel is
insignificant in a sentence in comparison to its occurrence in isolated words.


8.3.4 Pronunciation Lexicon Specification For Punjabi Language Within W3C Framework


In 2008 the World Wide Web Consortium (W3C) recommended the machine-readable
pronunciation lexicon framework (PLS 1.0), which is being used globally with suitable
language-specific adaptations, as discussed in section 1.7.2.


8.3.4.1 Research Contributions 
Unlike in English, grammatical information in Punjabi is encoded mainly in morphology
rather than syntax. Therefore, the morpho-syntactic features of Punjabi were examined
in correlation with the current PLS framework, and ten additional features were
identified.
i. New features:

• Script: to provide for the additional Shahmukhi script
• Morpho-syntactic features:
   - Rootword
   - Stem
   - Prefix
   - Suffix
   - Inf
   - POS
• Origin: to encode borrowed words
• MWE: to accommodate compound, duplicate, echo, named entities etc.
• Meaning: to differentiate homographs


ii. Incorporation of features in PLS framework

• Elements: Rootword, Stem, Suffix, Inf

The primary information representing morphological features is the rootword, which
will be treated as an element. Similarly, Stem, Suffix and Inf constitute the secondary
information, to be encoded as elements wherever applicable.

• Attributes: script, prefix, pos, origin, MWE, meaning

The script attribute is required to hold composite data for a language having multiple
scripts. Prefixes find limited use in the lexicon and hence can be incorporated as
attributes. The pos attribute will be used to define the part of speech of the
rootword/stem element. The origin attribute would help in the identification of
borrowed words. The MWE attribute is required to accommodate multi-word entries which
semantically need to be treated as a single entity. The meaning attribute will
differentiate homographs.

These new elements/attributes (represented in yellow colour) are proposed for
addition to the current framework, as presented below:


Elements | Attributes | Description
<lexicon> | version, xml:base, xmlns, xml:lang, alphabet, xml:script | root element for PLS
<meta> | name, http-equiv, content | element containing meta data
<metadata> | - | element containing meta data
<lexeme> | xml:id, role | the container element for a single lexical entry
<phoneme> | prefer, alphabet | contains pronunciation information for a lexeme
<alias> | prefer | contains acronym expansions and orthographic substitutions
<example> | - | contains an example of the usage for a lexeme


Table 8/4: Proposed PLS 2.0 Framework for Punjabi 


Sample data conforming to this framework has been presented in the previous chapter.
It covers various categories of words, so as to give a representative and complete
set, and serves as a guideline for developing machine-readable PLS data.
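
As a rough illustration of what an entry carrying one of the proposed extensions might
look like, the sketch below builds a single lexeme with Python's standard
`xml.etree.ElementTree`. It is not the sample data of the thesis. The `<lexicon>`,
`<lexeme>`, `<grapheme>` and `<phoneme>` elements and the namespace come from PLS 1.0;
the `<rootword>` element and the `script`, `origin` and `pos` attributes follow the
proposal above, but their placement, their value vocabularies ("Guru", "native",
"noun") and the function names are assumptions made for this example. Such a document
would not validate against the PLS 1.0 schema.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"  # PLS 1.0 namespace


def build_lexeme(grapheme: str, ipa: str, rootword: str, pos: str,
                 origin: str = "native", script: str = "Guru") -> ET.Element:
    """Build one <lexeme> carrying the proposed rootword element and attributes."""
    # Placing script and origin on <lexeme> is an assumption of this sketch.
    lexeme = ET.Element("lexeme", {"script": script, "origin": origin})
    ET.SubElement(lexeme, "grapheme").text = grapheme
    ET.SubElement(lexeme, "phoneme").text = ipa
    # Proposed element: the rootword, with its part of speech as an attribute.
    ET.SubElement(lexeme, "rootword", {"pos": pos}).text = rootword
    return lexeme


def build_lexicon(lexemes: Iterable[ET.Element], lang: str = "pa") -> str:
    """Wrap the lexemes in a <lexicon> root and serialise it to a string."""
    lexicon = ET.Element("lexicon", {
        "version": "1.0",
        "xmlns": PLS_NS,
        "alphabet": "ipa",
        "xml:lang": lang,
    })
    lexicon.extend(lexemes)
    return ET.tostring(lexicon, encoding="unicode")


# Illustrative entry: the Punjabi word for "water" in Gurmukhi script.
xml_text = build_lexicon([build_lexeme("ਪਾਣੀ", "paːɳiː", "ਪਾਣੀ", "noun")])
```

Keeping the rootword as a child element and the part of speech as its attribute mirrors
the element/attribute split described in the proposal; other placements are equally
possible.
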
8.4 Impact of Research Outcome on Speech Technologies in Punjabi 


All the above phonological research findings can be leveraged to implement a
computational phonology model for the Punjabi language. The proposed PLS 2.0
framework can be utilized to build a large word-level speech lexicon corpus containing
prosodic, syntactic and semantic information that can be used for machine learning.
The specific end-use cases are discussed below.

8.4.1 Punjabi Text-to-Speech (TTS) Systems


The open-source Festival engine, or other similar engines, can be used to quickly
build a TTS prototype, but the synthetic speech it delivers is difficult for humans to
comprehend. The prototype can be made usable by incorporating prosody to realize
human-like speech. A TTS prototype is not easy to build for Punjabi: because Punjabi is
tonal, the prosodic feature of tone must be implemented, and therefore, unlike for
Hindi, no Punjabi TTS has been developed so far. The in-depth treatment given to the
tonal features of Punjabi in this thesis will enable speech researchers to develop a
TTS prototype system. Incorporating the other research outcomes of this thesis can
help in achieving TTS of usable quality.
8.4.2 Language Identification Systems 


F0 and amplitude contours on a syllable-by-syllable basis are useful parameters for
language identification. Language-specific prosodic cues such as stress and tone,
examined in this thesis, can be utilized in Punjabi language identification.
8.4.3 Speech Recognition Systems Based on Prosody 


At present, speech recognition systems for Punjabi are at a nascent stage. Prosody
could be used to improve word recognition in ASR systems. Parameters such as pitch,
intensity and duration in different contexts have been reported in this thesis; they
can be used to generate speech vectors that a Punjabi speech recognition system can
optimize. The work reported in this thesis can also be used to develop a language
model and a probabilistic pattern-matching framework that makes use of these prosodic
features of the word in question, along with information from the associated word
sequence.
8.5 Future Research 


The work done on Punjabi prosody in this thesis can provide a strong foundation for
future research in the following areas:

8.5.1 Extension of Work from Word to Sentence Level in Punjabi

8.5.1.1 Intonation Study: The prosodic work currently done at the intra-syllabic 
level within a word can be extended by recording sentences for studying intonation, 
juncture etc. 

8.5.1.2 Co-articulation Modeling of Punjabi: Syllables having significant
co-articulation features can be examined to capture morpho-phonemic features, which
will help in reconstructing phonological knowledge from the speech stream. It may be
desirable to capture such features for consonants (other than stops), semi-vowels
etc., as a lot of variation has been noted in the data analyzed. The spreading of
nasal prosody can also be studied.

8.5.1.3 Speaker Variation: The data can be analysed further to report variations
between male and female speakers, and to capture acoustic variations across 10
speakers.

8.5.1.4 High Quality Acoustic Models in Punjabi: These rely on the availability of
large and reliably transcribed training sets that match the underlying distribution of
speech in different acoustic environments. A large set of phonetically and
prosodically rich data can be generated from the sample data, which can improve word
recognition accuracy.

8.5.1.5 Rule Based Formant Synthesis: The Klatt synthesizer requires a rule-based
approach to hand-crafting phonetic units, for which the PLS data can be utilized. This
data can also be useful for unit selection synthesis.

8.5.1.6 Language Identification: The tone patterns of Punjabi can be investigated
further, and Punjabi language data can be segregated from a multilingual data stream
on the basis of tonal feature extraction.

8.5.1.7 Comparative Study of Vowel Features: The acoustic variations of the schwa
vowel have been reported. A similar study can be done for other vowels.

8.5.1.8 Prosodic Features Based Modeling Techniques for Language Recognition: A model
that captures prosodic information such as F0, duration and intensity can be developed
for Punjabi speech recognition using modeling techniques such as neural networks, HMM,
GMM, DNN, N-gram and histogram models.

8.5.1.9 Extension of Work to other dialects of Punjabi: Similar data collection and
analysis can be done for other dialects of Punjabi such as Majhi, Doabi and Lehndi.

8.5.1.10 Extension of Work to other Indo-Aryan Languages: Most other Indo-Aryan
languages are phonetically similar to Punjabi but are non-tonal. Data could be
recorded for these languages and a similar analysis done to corroborate the findings
for a specific language.